Large Language Models
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails
·2622 words·13 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
DuoGuard, a novel two-player RL framework, generates high-quality synthetic data that improves multilingual LLM safety, outperforming state-of-the-art models with a significantly smaller model size and …
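The core idea is a generator and a guardrail classifier trained against each other, with the generator rewarded for producing examples the classifier mishandles. Below is a toy bandit-style sketch of such a two-player loop; the seed pool, the stand-in classifier, and the multiplicative-weights update are illustrative assumptions, not the paper's implementation.

```python
# Toy two-player loop in the spirit of DuoGuard (illustrative assumptions only):
# a generator samples prompt templates, the guardrail classifies them, and
# templates that fool the classifier earn reward, biasing future sampling
# toward the guardrail's failure modes.
import random

random.seed(0)

# Hypothetical seed pool: (prompt, true_label) with 1 = unsafe, 0 = safe.
POOL = [
    ("how do I pick a lock", 1),
    ("how do I pick a lock in a video game", 0),
    ("translate 'hello' to French", 0),
    ("steps to synthesize a toxin", 1),
]

weights = [1.0] * len(POOL)  # generator's sampling weights over prompts

def guardrail(text: str) -> int:
    """Stand-in classifier: flags anything containing obvious trigger words."""
    return int(any(w in text for w in ("lock", "toxin")))

for step in range(100):
    # Generator plays: sample a prompt proportionally to its weight.
    i = random.choices(range(len(POOL)), weights=weights)[0]
    text, label = POOL[i]
    # Classifier plays: the generator is rewarded when the classifier errs,
    # concentrating future generation on hard cases.
    reward = 1.0 if guardrail(text) != label else 0.0
    weights[i] *= 1.0 + 0.1 * reward  # multiplicative-weights update

hardest = max(range(len(POOL)), key=lambda i: weights[i])
print("hardest prompt for this guardrail:", POOL[hardest][0])
```

In a real pipeline the misclassified prompts would then be labeled and fed back as training data for the guardrail, closing the loop.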
Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions
·6016 words·29 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Brown University
Simple interactions can easily elicit harmful outputs from LLMs, a risk that is often overlooked. The SPEAK EASY framework and HARMSCORE metric expose this vulnerability and provide tools for better safety…
PILAF: Optimal Human Preference Sampling for Reward Modeling
·2374 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 NYU
PILAF optimizes human feedback in reward modeling for better LLM alignment by using a novel response sampling strategy that aligns reward modeling with value optimization.
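One plausible reading of the sampling strategy is drawing candidate responses from an interpolation between the current policy and a reference distribution, so that preference labels carry maximal signal for value optimization. The numpy sketch below illustrates that idea; the mixing rule, toy vocabulary, and logits are assumptions, not necessarily PILAF's exact scheme.

```python
# Hedged sketch: sample from an interpolation of policy and reference
# next-token distributions (assumed scheme for illustration).
import numpy as np

rng = np.random.default_rng(0)
vocab = ["A", "B", "C", "D"]
policy_logits = np.array([2.0, 0.5, 0.1, -1.0])
ref_logits = np.array([0.0, 0.0, 0.0, 0.0])  # uniform reference

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def sample_token(lam: float) -> str:
    """Sample from softmax(lam * policy + (1 - lam) * reference)."""
    probs = softmax(lam * policy_logits + (1.0 - lam) * ref_logits)
    return vocab[rng.choice(len(vocab), p=probs)]

# A preference pair drawn at two interpolation strengths: one response hugs
# the current policy, the other explores toward the reference.
pair = (sample_token(lam=1.0), sample_token(lam=0.5))
print("candidate pair for human labeling:", pair)
```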
MAGA: MAssive Genre-Audience Reformulation to Pretraining Corpus Expansion
·2840 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 ByteDance
MAGA reformulates existing corpora to massively expand LLM pretraining data, boosting performance across various model sizes while maintaining quality.
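The reformulation step amounts to prompting a model to rewrite each source passage for many (genre, audience) pairs, multiplying the corpus size. A minimal sketch follows; the prompt wording, genre/audience lists, and the `rewrite_llm` stub are illustrative stand-ins for whatever generation endpoint is actually used.

```python
# Minimal sketch of genre-audience corpus expansion (all specifics assumed).
from itertools import product

GENRES = ["encyclopedia entry", "dialogue", "tutorial"]
AUDIENCES = ["children", "domain experts"]

def rewrite_llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real generation endpoint."""
    return f"<rewritten per: {prompt[:60]}...>"

def expand(passage: str) -> list[str]:
    """Produce len(GENRES) * len(AUDIENCES) reformulations of one passage."""
    out = []
    for genre, audience in product(GENRES, AUDIENCES):
        prompt = (f"Rewrite the passage as a {genre} aimed at {audience}, "
                  f"preserving all facts:\n{passage}")
        out.append(rewrite_llm(prompt))
    return out

corpus = ["The mitochondrion is the powerhouse of the cell."]
expanded = [r for doc in corpus for r in expand(doc)]
print(f"{len(corpus)} documents -> {len(expanded)} reformulations")
```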
Linear Correlation in LM's Compositional Generalization and Hallucination
·8299 words·39 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 UC San Diego
Language models surprisingly exhibit linear relationships when composing knowledge; this linearity, resilient to fine-tuning, predicts compositional generalization and hallucination.
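The phenomenon can be probed directly: fit a linear map from the logits a model assigns under one prompt family to the logits under a related family, and measure fit quality. The numpy sketch below uses synthetic logits as a stand-in for real model outputs.

```python
# Probe for a linear relationship between logits of related prompts,
# e.g. "X lives in the city of ..." vs "X lives in the country of ...".
# Random data stands in for real model logits.
import numpy as np

rng = np.random.default_rng(0)
n_pairs, vocab = 200, 50

source = rng.normal(size=(n_pairs, vocab))            # logits for prompt A
true_W = 0.1 * rng.normal(size=(vocab, vocab))
target = source @ true_W + 0.05 * rng.normal(size=(n_pairs, vocab))

# Least-squares fit of target ~= source @ W; a high R^2 indicates the kind
# of linear correlation the paper reports between knowledge pairs.
W, *_ = np.linalg.lstsq(source, target, rcond=None)
resid = target - source @ W
r2 = 1.0 - resid.var() / target.var()
print(f"R^2 of linear fit: {r2:.3f}")
```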
CMoE: Fast Carving of Mixture-of-Experts for Efficient LLM Inference
·1697 words·8 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Chinese University of Hong Kong
CMoE efficiently transforms dense LLMs into sparse MoE architectures via expert carving, enabling fast inference without extensive retraining.
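Expert carving can be pictured as grouping a dense FFN's neurons by their activation statistics and routing each token to the few groups most likely to fire. A toy numpy sketch follows; the crude sort-and-split clustering and the sizes are assumptions for illustration, not the paper's carving rule.

```python
# Toy "carving" of a dense FFN into experts: group hidden neurons by
# activation profile, then route each token to its most active group.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, n_experts, n_tokens = 16, 64, 4, 100

W_in = rng.normal(size=(d_model, d_ff))
acts = np.maximum(rng.normal(size=(n_tokens, d_model)) @ W_in, 0.0)  # ReLU

# Crude clustering: sort neurons by mean activation and split into
# equal-size groups (real methods use proper clustering and balancing).
order = np.argsort(acts.mean(axis=0))
experts = np.array_split(order, n_experts)

def route(token_acts: np.ndarray) -> int:
    """Pick the expert whose neurons are most active for this token."""
    scores = [token_acts[idx].sum() for idx in experts]
    return int(np.argmax(scores))

chosen = [route(acts[t]) for t in range(n_tokens)]
print("expert load:", np.bincount(chosen, minlength=n_experts))
```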
BOLT: Bootstrap Long Chain-of-Thought in Language Models without Distillation
·2217 words·11 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Salesforce AI Research
BOLT bootstraps Long Chain-of-Thought reasoning in LLMs without distillation, achieving impressive results across various benchmarks.
Beyond Prompt Content: Enhancing LLM Performance via Content-Format Integrated Prompt Optimization
·2451 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Research
Researchers jointly optimize prompt content and format to significantly boost Large Language Model (LLM) performance.
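Joint optimization means searching over both what a prompt says and how it is laid out. The minimal grid-search sketch below shows the shape of such a search; the candidate lists and the `score` stub are hypothetical placeholders for a real dev-set evaluator, not the paper's optimizer.

```python
# Minimal joint search over prompt content and format (all specifics assumed).
CONTENTS = ["Answer step by step.", "Answer concisely.", "Think, then answer."]
FORMATS = ["plain", "markdown_list", "json"]

def score(content: str, fmt: str) -> float:
    """Hypothetical dev-set accuracy for a (content, format) combination."""
    fmt_bonus = {"plain": 0.0, "markdown_list": 0.1, "json": 0.05}
    return 0.01 * len(content) + fmt_bonus[fmt]

best, best_score = None, float("-inf")
for content in CONTENTS:          # outer loop over content candidates
    for fmt in FORMATS:           # inner loop over format candidates
        s = score(content, fmt)
        if s > best_score:
            best, best_score = (content, fmt), s
print("best prompt config:", best, f"score={best_score:.2f}")
```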
Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
·3144 words·15 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Meta AI
Boosting language model reasoning: A novel hybrid approach using latent tokens drastically shortens reasoning traces, improving model performance and efficiency.
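Mixing latent and text tokens can be sketched as quantizing the early part of a reasoning trace into discrete codebook indices while keeping the remainder as text. In the toy below, the codebook, stand-in embeddings, and split point are assumptions; a real system would use learned codes (e.g., from a VQ-style autoencoder).

```python
# Toy sketch: compress the head of a reasoning trace into discrete latent
# codes (nearest codebook vector), leaving the tail as plain text.
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))  # 8 latent codes, 4-dim embeddings

def quantize(embs: np.ndarray) -> list[str]:
    """Map each embedding to its nearest codebook entry's id token."""
    dists = ((embs[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
    return [f"<z{int(i)}>" for i in dists.argmin(axis=1)]

trace = ["First,", "compute", "2+2=4.", "So", "the", "answer", "is", "4."]
embs = rng.normal(size=(len(trace), 4))  # stand-in token embeddings

split = len(trace) // 2  # abstract the head, keep the tail verbatim
mixed = quantize(embs[:split]) + trace[split:]
print(" ".join(mixed))  # e.g. "<z3> <z1> <z7> <z0> the answer is 4."
```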
Teaching Language Models to Critique via Reinforcement Learning
·4328 words·21 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Hong Kong
LLMs learn to critique and refine their output via reinforcement learning, significantly improving code generation.
LIMO: Less is More for Reasoning
·2691 words·13 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Generative AI Research
LIMO: Few examples unlock complex reasoning in LLMs, challenging assumptions about data-hungry models and achieving state-of-the-art results with minimal training.
Gold-medalist Performance in Solving Olympiad Geometry with AlphaGeometry2
·4637 words·22 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Google DeepMind
AlphaGeometry2 surpasses the average IMO gold medalist at solving Olympiad geometry problems.
Analyze Feature Flow to Enhance Interpretation and Steering in Language Models
·5882 words·28 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 T-Tech
Researchers unveil a data-free method to visualize and control feature flow in LLMs, enhancing interpretability and enabling targeted model steering.
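One data-free way to trace a feature across layers, consistent with this summary, is to match sparse-autoencoder decoder directions between adjacent layers by cosine similarity of their weights alone. The numpy sketch below uses random matrices as stand-ins for trained SAE decoders; the matching rule is an assumption for illustration.

```python
# Data-free feature matching across layers: pair each SAE feature in layer L
# with its most similar decoder direction in layer L+1 (cosine similarity).
import numpy as np

rng = np.random.default_rng(0)
n_feat, d_model = 32, 64
dec_l = rng.normal(size=(n_feat, d_model))     # layer L decoder directions
dec_next = rng.normal(size=(n_feat, d_model))  # layer L+1 decoder directions

def unit(x):
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

sims = unit(dec_l) @ unit(dec_next).T   # (n_feat, n_feat) cosine matrix
match = sims.argmax(axis=1)             # best successor for each feature
strength = sims.max(axis=1)

# Strong matches suggest the "same" feature persists into the next layer,
# which is the basis for tracing and steering feature flow.
for i in np.argsort(-strength)[:3]:
    print(f"feature {i} -> feature {match[i]} (cos={strength[i]:.2f})")
```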
Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search
·3854 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 MIT
Satori: A novel 7B LLM achieves state-of-the-art mathematical reasoning via autoregressive search.
QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search
·2983 words·15 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of California, Los Angeles
QLASS boosts language agent inference by using Q-values to guide a stepwise search, improving efficiency and performance even with limited data.
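Q-guided stepwise search amounts to a beam search in which candidate next steps are ranked by a learned Q-value rather than by likelihood alone. Here is a toy version; the action set and hand-coded Q function are assumptions standing in for the Q-network QLASS trains from exploration data.

```python
# Toy Q-guided stepwise search: expand each partial trajectory with candidate
# actions, score extensions with a Q estimate, keep the top beam_width.
ACTIONS = ["search", "click", "type", "submit"]

def q_value(trajectory: tuple, action: str) -> float:
    """Stand-in for a learned Q(s, a); QLASS trains this from rollouts."""
    base = {"search": 0.5, "click": 0.3, "type": 0.4, "submit": 0.2}[action]
    bonus = 1.0 if action == "submit" and len(trajectory) >= 2 else 0.0
    return base + bonus - 0.1 * len(trajectory)

def q_guided_search(steps: int = 3, beam_width: int = 2):
    beams = [((), 0.0)]  # (trajectory, cumulative score)
    for _ in range(steps):
        candidates = [(traj + (a,), score + q_value(traj, a))
                      for traj, score in beams for a in ACTIONS]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_width]  # prune by Q-guided score
    return beams

for traj, score in q_guided_search():
    print(" -> ".join(traj), f"(score={score:.2f})")
```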
On Teacher Hacking in Language Model Distillation
·2783 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Google DeepMind
Language model distillation suffers from 'teacher hacking': student models over-optimize against a flawed teacher, degrading true performance. This paper identifies the issue and offers effective…
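The failure mode is easy to state numerically: the student minimizes divergence to the teacher, but when the teacher itself diverges from the ground truth, driving the proxy loss to zero can move the student further from the truth. A small numpy illustration with made-up distributions:

```python
# Numerical picture of teacher hacking: matching a flawed teacher perfectly
# can increase the student's gap to the ground-truth distribution.
import numpy as np

def kl(p, q):
    """KL divergence between discrete distributions p and q."""
    return float(np.sum(p * np.log(p / q)))

truth   = np.array([0.70, 0.20, 0.10])    # ground-truth distribution
teacher = np.array([0.50, 0.40, 0.10])    # imperfect teacher (proxy target)
student_a = np.array([0.65, 0.25, 0.10])  # moderately teacher-like student
student_b = teacher.copy()                # perfectly "hacked" to the teacher

for name, s in [("student_a", student_a), ("student_b", student_b)]:
    print(f"{name}: KL(teacher||s)={kl(teacher, s):.4f}  "
          f"KL(truth||s)={kl(truth, s):.4f}")
# student_b wins on the proxy (KL to teacher = 0) yet is worse on the truth.
```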
ZebraLogic: On the Scaling Limits of LLMs for Logical Reasoning
·2452 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Washington
LLMs struggle with complex logical reasoning; ZebraLogic benchmark reveals a ‘curse of complexity’, highlighting inherent limitations and guiding future research.
The Differences Between Direct Alignment Algorithms are a Blur
·3273 words·16 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 T-Tech
The differences between direct alignment algorithms largely blur together; this paper shows that a simple SFT phase and a single scaling parameter significantly improve alignment quality, regardless of the specific reward function used.
Process Reinforcement through Implicit Rewards
·3889 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
PRIME (Process Reinforcement through IMplicit rEwards) revolutionizes LLM training by efficiently using implicit process rewards from online policy rollouts and outcome labels, significantly boosting …
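An implicit process reward of this kind is commonly formalized as a per-token log-likelihood ratio, r_t = β log(π(y_t | y_<t, x) / π_ref(y_t | y_<t, x)), turning a model trained only on outcome labels into a dense, stepwise signal. The numpy sketch below computes such rewards; the token probabilities and β are made-up values for illustration.

```python
# Per-token implicit process rewards as a scaled log-likelihood ratio,
# r_t = beta * log(p_model(y_t) / p_ref(y_t)); probabilities are made up.
import numpy as np

beta = 0.05
p_model = np.array([0.60, 0.80, 0.30, 0.90])  # trained model's token probs
p_ref   = np.array([0.50, 0.50, 0.50, 0.50])  # frozen reference token probs

rewards = beta * np.log(p_model / p_ref)      # dense, stepwise signal
print("per-token rewards:", np.round(rewards, 4))
print("trajectory return:", round(float(rewards.sum()), 4))
```

Note how the low-probability third token receives a negative reward, localizing credit assignment without any process labels.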
PlotGen: Multi-Agent LLM-based Scientific Data Visualization via Multimodal Feedback
·523 words·3 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 IGDTUW, Delhi
PlotGen: A novel multi-agent LLM framework automates accurate scientific data visualization via multimodal feedback, boosting novice productivity and improving visualization accuracy.
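The multi-agent structure is a generate-render-critique loop. The skeleton below wires stub agents together to show the control flow; all three agent functions and the feedback format are illustrative assumptions, not the paper's agents.

```python
# Skeleton of a generate-render-critique loop in the spirit of PlotGen;
# all agent bodies are illustrative stubs.
def code_agent(task, feedback):
    """Draft (or revise) plotting code for the task."""
    fix = f"  # revised per: {feedback}" if feedback else ""
    return f"plt.plot(x, y){fix}"

def render(code):
    """Pretend to execute the code and return an image descriptor."""
    return f"<image of: {code}>"

def critique_agent(task, image):
    """Multimodal critic: return feedback, or None when satisfied."""
    return None if "revised" in image else "axis labels are missing"

task = "line chart of monthly revenue"
feedback = None
for round_ in range(3):                 # bounded refinement loop
    code = code_agent(task, feedback)
    feedback = critique_agent(task, render(code))
    if feedback is None:
        print(f"accepted after {round_ + 1} round(s):", code)
        break
```

Bounding the loop matters in practice: the critic may never be fully satisfied, so a round limit keeps the pipeline from spinning.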