Large Language Models

PILAF: Optimal Human Preference Sampling for Reward Modeling

6 February 2025·2374 words·12 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NYU

PILAF optimizes human feedback in reward modeling for better LLM alignment by using a novel response sampling strategy that aligns reward modeling with value optimization.

MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion

6 February 2025·2840 words·14 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 ByteDance

MAGA reformulates existing corpora to massively expand LLM pretraining data, boosting performance across various model sizes while maintaining quality.

Linear Correlation in LM's Compositional Generalization and Hallucination

6 February 2025·8299 words·39 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC San Diego

Language models surprisingly exhibit linear relationships when composing knowledge; this linearity, resilient to fine-tuning, predicts compositional generalization and hallucination.

CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference

6 February 2025·1697 words·8 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong

CMOE efficiently transforms dense LLMs into sparse MoE architectures via expert carving, enabling fast inference without extensive retraining.

BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation

6 February 2025·2217 words·11 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Salesforce AI Research

BOLT bootstraps Long Chain-of-Thought reasoning in LLMs without distillation, achieving impressive results across various benchmarks.

Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization

6 February 2025·2451 words·12 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research

Researchers jointly optimize prompt content and format to significantly boost Large Language Model (LLM) performance.

Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning

5 February 2025·3144 words·15 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Meta AI

Boosting language model reasoning: A novel hybrid approach using latent tokens drastically shortens reasoning traces, improving model performance and efficiency.

Teaching Language Models to Critique via Reinforcement Learning

5 February 2025·4328 words·21 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Hong Kong

LLMs learn to critique and refine their output via reinforcement learning, significantly improving code generation.

LIMO: Less is More for Reasoning

5 February 2025·2691 words·13 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Generative Al Research

LIMO: Few examples unlock complex reasoning in LLMs, challenging assumptions about data-hungry models and achieving state-of-the-art results with minimal training.

Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2

5 February 2025·4637 words·22 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Google DeepMind

AlphaGeometry2 surpasses average IMO gold medalists in solving geometry problems!

Analyze Feature Flow to Enhance Interpretation and Steering in Language Models

5 February 2025·5882 words·28 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 T-Tech

Researchers unveil a data-free method to visualize and control feature flow in LLMs, enhancing interpretability and enabling targeted model steering.

Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search

4 February 2025·3854 words·19 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 MIT

Satori: A novel 7B LLM achieves state-of-the-art mathematical reasoning via autoregressive search.

QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search

4 February 2025·2983 words·15 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC Los Angeles

QLASS boosts language agent inference by using Q-values to guide a stepwise search, improving efficiency and performance even with limited data.

On Teacher Hacking in Language Model Distillation

4 February 2025·2783 words·14 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Google DeepMind

Language model distillation suffers from ’teacher hacking’, where student models over-optimize flawed teacher models, degrading true performance. This paper identifies this issue and offers effective…

ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning

3 February 2025·2452 words·12 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Washington

LLMs struggle with complex logical reasoning; ZebraLogic benchmark reveals a ‘curse of complexity’, highlighting inherent limitations and guiding future research.

The Differences Between Direct Alignment Algorithms are a Blur

3 February 2025·3273 words·16 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 T-Tech

Direct alignment algorithms are a blur, but this paper shows how a simple SFT phase and a scaling parameter significantly improve alignment quality, regardless of the specific reward function used.

Process Reinforcement through Implicit Rewards

3 February 2025·3889 words·19 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University

PRIME (Process Reinforcement through IMplicit rEwards) revolutionizes LLM training by efficiently using implicit process rewards from online policy rollouts and outcome labels, significantly boosting …

PlotGen: Multi-Agent LLM-based Scientific Data Visualization via Multimodal Feedback

3 February 2025·523 words·3 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 IGDTUW, Delhi

PlotGen: A novel multi-agent LLM framework automates accurate scientific data visualization via multimodal feedback, boosting novice productivity and improving visualization accuracy.

PhD Knowledge Not Required: A Reasoning Challenge for Large Language Models

3 February 2025·1257 words·6 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Wellesley College

New benchmark challenges LLMs with general knowledge puzzles, revealing reasoning gaps and suggesting improvements for future models.

Lifelong Sequential Knowledge Editing without Model Degradation

3 February 2025·13067 words·62 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC Berkeley

ENCORE enables lifelong sequential knowledge editing in LLMs without performance loss, achieving 10,000 edits while maintaining downstream accuracy.