Natural Language Processing

Can LLMs Implicitly Learn Numeric Parameter Constraints in Data Science APIs?
·2762 words·13 mins
Natural Language Processing Large Language Models 🏢 University of Illinois Urbana-Champaign
LLMs struggle to reliably generate valid data science code due to a lack of true understanding of numerical constraints in APIs, despite seemingly mastering common patterns through extensive training.
Can large language models explore in-context?
·4498 words·22 mins
AI Generated Natural Language Processing Large Language Models 🏢 Microsoft Research
LLMs struggle with in-context exploration, needing substantial prompt engineering or training interventions to effectively explore multi-armed bandit environments.
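To make the bandit setting concrete, here is a minimal sketch of a Bernoulli multi-armed bandit with a classic epsilon-greedy baseline. This is an illustration of the environment class, not the paper's protocol: the paper instead presents the interaction history to an LLM in its prompt and asks it to pick the next arm. All names below are illustrative.

```python
import random

# Epsilon-greedy on a Bernoulli bandit: explore with probability epsilon,
# otherwise exploit the arm with the best running mean reward so far.
def epsilon_greedy(true_means, horizon=1000, epsilon=0.1, seed=0):
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms      # pulls per arm
    values = [0.0] * n_arms    # running mean reward per arm
    total = 0.0
    for _ in range(horizon):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)                      # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
        total += reward
    return total / horizon

print(epsilon_greedy([0.3, 0.5, 0.7]))  # approaches 0.7 as exploitation dominates
```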
Can Large Language Model Agents Simulate Human Trust Behavior?
·3567 words·17 mins
Natural Language Processing Large Language Models 🏢 University of Oxford
LLM agents, especially GPT-4, surprisingly exhibit human-like trust behavior, paving the way for simulating complex human interactions in various applications.
Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?
·7400 words·35 mins
AI Generated Natural Language Processing Large Language Models 🏢 Hong Kong Baptist University
LLMs struggle with noisy rationales in chain-of-thought prompting. The paper introduces the NoRa dataset to expose this weakness in existing methods and proposes CD-CoT, which significantly improves accuracy.
Can Language Models Learn to Skip Steps?
·2929 words·14 mins
Natural Language Processing Large Language Models 🏢 UC Santa Barbara
Language models learn to skip steps in reasoning, improving efficiency and generalization, showcasing emergent human-like cognitive abilities.
Can Graph Learning Improve Planning in LLM-based Agents?
·2929 words·14 mins
Natural Language Processing Large Language Models 🏢 Peking University
GNNs enhance LLM-based task planning by improving the ability to process task graphs, surpassing existing solutions even without training.
Calibrating Reasoning in Language Models with Internal Consistency
·2546 words·12 mins
Natural Language Processing Large Language Models 🏢 Shanghai Jiao Tong University
LLMs’ reasoning can be improved by using internal consistency to calibrate their outputs.
Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment
·2104 words·10 mins
Natural Language Processing Large Language Models 🏢 Artificial Intelligence Research Laboratory, Pennsylvania State University
Cal-DPO calibrates implicit rewards in contrastive preference learning, dramatically improving large language model alignment with human preferences.
Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers
·2873 words·14 mins
Natural Language Processing Large Language Models 🏢 CLAIRE, EPFL
Training large language models efficiently is key; this paper shows how structured feedforward layers and a novel training regime significantly reduce computational costs and improve training efficiency.
Bridging semantics and pragmatics in information-theoretic emergent communication
·1593 words·8 mins
Natural Language Processing Dialogue Systems 🏢 Apple
AI agents learn human-like communication, combining semantic categorization and pragmatic context-sensitive reasoning, through a novel information-theoretic framework.
Bridge-IF: Learning Inverse Protein Folding with Markov Bridges
·1691 words·8 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
Bridge-IF, a novel generative diffusion model, excels at inverse protein folding by learning probabilistic dependencies between protein structures and sequences, significantly outperforming existing methods.
Bridge the Modality and Capability Gaps in Vision-Language Model Selection
·3390 words·16 mins
AI Generated Natural Language Processing Vision-Language Models 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
SWAB bridges modality and capability gaps in Vision-Language Model selection using optimal transport, enabling accurate prediction of VLM performance without images.
Breaking Determinism: Fuzzy Modeling of Sequential Recommendation Using Discrete State Space Diffusion Model
·1801 words·9 mins
AI Generated Natural Language Processing Recommendation Systems 🏢 University of Science and Technology of China
DDSR, a novel sequential recommendation model, uses fuzzy sets and discrete diffusion to capture the randomness in user behavior, outperforming existing methods.
Boosting Weakly Supervised Referring Image Segmentation via Progressive Comprehension
·5057 words·24 mins
AI Generated Natural Language Processing Vision-Language Models 🏢 City University of Hong Kong
PCNet boosts weakly-supervised referring image segmentation by progressively processing textual cues, mimicking human comprehension, and significantly improving target localization.
Boosting the Potential of Large Language Models with an Intelligent Information Assistant
·1837 words·9 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
Boosting LLMs with an intelligent information assistant, ASSISTRAG, significantly improves accuracy and reasoning, especially for less advanced models.
Boosting Semi-Supervised Scene Text Recognition via Viewing and Summarizing
·3128 words·15 mins
AI Generated Natural Language Processing Semi-Supervised Learning 🏢 University of Science and Technology of China
ViSu boosts semi-supervised scene text recognition by using an online generation strategy for diverse synthetic data and a novel character alignment loss to improve model generalization and robustness.
BoNBoN Alignment for Large Language Models and the Sweetness of Best-of-n Sampling
·2612 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Department of Statistics, University of Chicago
BoNBoN alignment optimizes large language model (LLM) outputs towards human preferences using best-of-n sampling, maximizing win rate against the base model with minimal off-target impact.
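For context, here is a minimal sketch of the best-of-n sampling the summary refers to: draw n candidates from the base model, score each with a preference model, and keep the best. The `generate` and `reward` callables are hypothetical stand-ins, not the paper's API; BoNBoN itself goes further and fine-tunes the model toward the best-of-n behavior.

```python
import random

# Best-of-n sampling: n i.i.d. samples from the base model, ranked by a
# reward function; the highest-scoring candidate is returned.
def best_of_n(prompt, generate, reward, n=8):
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: reward(prompt, c))

# Toy usage: "generation" is a random number, and the reward prefers larger values.
print(best_of_n("hi", lambda p: random.random(), lambda p, c: c, n=4))
```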
BLoB: Bayesian Low-Rank Adaptation by Backpropagation for Large Language Models
·3160 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Rutgers University
BLoB: Bayesian Low-Rank Adaptation by Backpropagation enhances LLMs by jointly tuning mean and covariance of parameters during fine-tuning, improving uncertainty estimation and generalization.
BitDelta: Your Fine-Tune May Only Be Worth One Bit
·2156 words·11 mins
Natural Language Processing Large Language Models 🏢 MIT
BitDelta drastically shrinks fine-tuned LLMs by quantizing their weight deltas to just one bit, achieving 10x memory reduction and latency improvements without sacrificing performance.
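The core idea in the summary is easy to picture: keep only the sign of each fine-tuning weight delta plus a single scale per weight matrix. The sketch below illustrates that 1-bit delta representation under stated assumptions; the full method additionally calibrates the scales, which this sketch omits.

```python
import torch

# Represent a fine-tuned matrix as base + sign(delta) * scale: 1 bit per weight.
def one_bit_delta(w_base: torch.Tensor, w_finetuned: torch.Tensor):
    delta = w_finetuned - w_base
    sign = torch.sign(delta)      # the 1-bit part
    scale = delta.abs().mean()    # one scale for the whole matrix
    return sign, scale

def reconstruct(w_base, sign, scale):
    return w_base + sign * scale  # approximate fine-tuned weights

# Toy check of reconstruction error on a small random "fine-tune".
w0 = torch.randn(4, 4)
w1 = w0 + 0.01 * torch.randn(4, 4)
s, a = one_bit_delta(w0, w1)
err = (reconstruct(w0, s, a) - w1).abs().mean().item()
print(f"mean abs reconstruction error: {err:.4f}")
```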
BiScope: AI-generated Text Detection by Checking Memorization of Preceding Tokens
·2196 words·11 mins
Natural Language Processing Large Language Models 🏢 Purdue University
BiScope detects AI-generated text with a novel bidirectional method that outperforms existing techniques by leveraging both prediction and memorization of preceding tokens.