Natural Language Processing

Can LLMs Implicitly Learn Numeric Parameter Constraints in Data Science APIs?
·2762 words·13 mins
Natural Language Processing Large Language Models 🏢 University of Illinois Urbana-Champaign
LLMs struggle to reliably generate valid data science code due to a lack of true understanding of numerical constraints in APIs, despite seemingly mastering common patterns through extensive training.
Can large language models explore in-context?
·4498 words·22 mins
AI Generated Natural Language Processing Large Language Models 🏢 Microsoft Research
LLMs struggle with in-context exploration, needing substantial prompt engineering or training interventions to effectively explore multi-armed bandit environments.
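To make the bandit setting concrete, here is a minimal sketch of a Bernoulli multi-armed bandit with a classic epsilon-greedy baseline. This is an illustration of the environment class, not the paper's protocol: the paper instead presents the interaction history to an LLM in its prompt and asks it to pick the next arm. All names below are illustrative.

```python
import random

# Epsilon-greedy on a Bernoulli bandit: explore with probability epsilon,
# otherwise exploit the arm with the best running mean reward so far.
def epsilon_greedy(true_means, horizon=1000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                      # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total += reward
    return total / horizon

print(epsilon_greedy([0.3, 0.5, 0.7]))  # approaches 0.7 as exploitation dominates
```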
Can Large Language Model Agents Simulate Human Trust Behavior?
·3567 words·17 mins
Natural Language Processing Large Language Models 🏢 University of Oxford
LLM agents, especially GPT-4, surprisingly exhibit human-like trust behavior, paving the way for simulating complex human interactions in various applications.
Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?
·7400 words·35 mins
AI Generated Natural Language Processing Large Language Models 🏢 Hong Kong Baptist University
LLMs struggle with noisy rationales in chain-of-thought prompting. The paper introduces the NoRa dataset to expose this weakness in existing methods and proposes CD-CoT, which significantly improves accuracy.
Can Language Models Learn to Skip Steps?
·2929 words·14 mins
Natural Language Processing Large Language Models 🏢 UC Santa Barbara
Language models learn to skip steps in reasoning, improving efficiency and generalization, showcasing emergent human-like cognitive abilities.
Can Graph Learning Improve Planning in LLM-based Agents?
·2929 words·14 mins
Natural Language Processing Large Language Models 🏢 Peking University
GNNs enhance LLM-based task planning by improving the ability to process task graphs, surpassing existing solutions even without training.
Calibrating Reasoning in Language Models with Internal Consistency
·2546 words·12 mins
Natural Language Processing Large Language Models 🏢 Shanghai Jiao Tong University
LLMs’ reasoning can be improved by using internal consistency to calibrate their outputs.
Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment
·2104 words·10 mins
Natural Language Processing Large Language Models 🏢 Artificial Intelligence Research Laboratory, Pennsylvania State University
Cal-DPO calibrates implicit rewards in contrastive preference learning, dramatically improving large language model alignment with human preferences.
Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers
·2873 words·14 mins
Natural Language Processing Large Language Models 🏢 CLAIRE, EPFL
Training large language models efficiently is key; this paper shows how structured feedforward layers and a novel training regime significantly reduce computational costs and improve training efficiency.
Bridging semantics and pragmatics in information-theoretic emergent communication
·1593 words·8 mins
Natural Language Processing Dialogue Systems 🏢 Apple
AI agents learn human-like communication, combining semantic categorization and pragmatic context-sensitive reasoning, through a novel information-theoretic framework.
Bridge-IF: Learning Inverse Protein Folding with Markov Bridges
·1691 words·8 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
Bridge-IF, a novel generative diffusion model, excels at inverse protein folding by learning probabilistic dependencies between protein structures and sequences, significantly outperforming existing methods.
Bridge the Modality and Capability Gaps in Vision-Language Model Selection
·3390 words·16 mins
AI Generated Natural Language Processing Vision-Language Models 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
SWAB bridges modality and capability gaps in Vision-Language Model selection using optimal transport, enabling accurate prediction of VLM performance without images.
Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model
·1801 words·9 mins
AI Generated Natural Language Processing Recommendation Systems 🏢 University of Science and Technology of China
DDSR, a novel sequential recommendation model, uses fuzzy sets and discrete diffusion to capture the randomness in user behavior, outperforming existing methods.
Boosting Weakly Supervised Referring Image Segmentation via Progressive Comprehension
·5057 words·24 mins
AI Generated Natural Language Processing Vision-Language Models 🏢 City University of Hong Kong
PCNet boosts weakly-supervised referring image segmentation by progressively processing textual cues, mimicking human comprehension, and significantly improving target localization.
Boosting the Potential of Large Language Models with an Intelligent Information Assistant
·1837 words·9 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
Boosting LLMs with an intelligent information assistant, ASSISTRAG, significantly improves accuracy and reasoning, especially for less advanced models.
Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing
·3128 words·15 mins
AI Generated Natural Language Processing Semi-Supervised Learning 🏢 University of Science and Technology of China
ViSu boosts semi-supervised scene text recognition by using an online generation strategy for diverse synthetic data and a novel character alignment loss to improve model generalization and robustness.
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
·2612 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Department of Statistics, University of Chicago
BoNBoN alignment optimizes large language model (LLM) outputs towards human preferences using best-of-n sampling, maximizing win rate against the base model with minimal off-target impact.
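For context, here is a minimal sketch of the best-of-n sampling the summary refers to: draw n candidates from the base model, score each with a preference model, and keep the best. The `generate` and `reward` callables are hypothetical stand-ins, not the paper's API; BoNBoN itself goes further and fine-tunes the model toward the best-of-n behavior.

```python
import random

# Best-of-n sampling: n i.i.d. samples from the base model, ranked by a
# reward function; the highest-scoring candidate is returned.
def best_of_n(prompt, generate, reward, n=8):
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: reward(prompt, c))

# Toy usage: "generation" is a random number, and the reward prefers larger values.
print(best_of_n("hi", lambda p: random.random(), lambda p, c: c, n=4))
```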
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
·3160 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Rutgers University
BLoB: Bayesian Low-Rank Adaptation by Backpropagation enhances LLMs by jointly tuning mean and covariance of parameters during fine-tuning, improving uncertainty estimation and generalization.
BitDelta: Your Fine-Tune May Only Be Worth One Bit
·2156 words·11 mins
Natural Language Processing Large Language Models 🏢 MIT
BitDelta drastically shrinks fine-tuned LLMs by quantizing their weight deltas to just one bit, achieving 10x memory reduction and latency improvements without sacrificing performance.
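The core idea in the summary is easy to picture: keep only the sign of each fine-tuning weight delta plus a single scale per weight matrix. The sketch below illustrates that 1-bit delta representation under stated assumptions; the full method additionally calibrates the scales, which this sketch omits.

```python
import torch

# Represent a fine-tuned matrix as base + sign(delta) * scale: 1 bit per weight.
def one_bit_delta(w_base: torch.Tensor, w_finetuned: torch.Tensor):
    delta = w_finetuned - w_base
    sign = torch.sign(delta)      # the 1-bit part
    scale = delta.abs().mean()    # one scale for the whole matrix
    return sign, scale

def reconstruct(w_base, sign, scale):
    return w_base + sign * scale  # approximate fine-tuned weights

# Toy check of reconstruction error on a small random "fine-tune".
w0 = torch.randn(4, 4)
w1 = w0 + 0.01 * torch.randn(4, 4)
s, a = one_bit_delta(w0, w1)
err = (reconstruct(w0, s, a) - w1).abs().mean().item()
print(f"mean abs reconstruction error: {err:.4f}")
```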
BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
·2196 words·11 mins
Natural Language Processing Large Language Models 🏢 Purdue University
BiScope detects AI-generated text with a novel bidirectional method that outperforms existing techniques by leveraging both prediction and memorization of preceding tokens.