Natural Language Processing

PAFT: Prompt-Agnostic Fine-Tuning
·3569 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
PAFT dynamically adjusts prompts during LLM fine-tuning, improving model robustness and generalization across diverse prompts without sacrificing performance or efficiency.
MoBA: Mixture of Block Attention for Long-Context LLMs
·3939 words·19 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Moonshot AI
MoBA: Mixture of Block Attention enables efficient long-context LLMs by dynamically selecting relevant blocks, improving performance without compromising efficiency.
How Much Do LLMs Hallucinate across Languages? On Multilingual Estimation of LLM Hallucination in the Wild
·3895 words·19 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 WüNLP, CAIDAS, University of Würzburg
Multilingual LLMs hallucinate! This study measures hallucination in the wild across 30 languages.
HeadInfer: Memory-Efficient LLM Inference by Head-wise Offloading
·4689 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 California Institute of Technology
HeadInfer achieves memory-efficient LLM inference by offloading the key-value cache to the CPU head by head, enabling 4-million-token inference on a single consumer GPU.
Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge
·3819 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 City University of Hong Kong
Crowd-based comparative evaluation significantly boosts LLM-as-a-judge accuracy by using crowd responses to surface deeper details, yielding more reliable and efficient auto-evaluation.
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity
·2814 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 AIRI
LLMs can losslessly compress 1568 tokens into a single vector, surpassing prior methods by two orders of magnitude.
System Message Generation for User Preferences using Open-Source Models
·3777 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Upstage AI
SYSGEN: A novel pipeline generates effective system messages for LLMs using open-source models, improving model responses and addressing data scarcity in supervised fine-tuning.
SAFE-SQL: Self-Augmented In-Context Learning with Fine-grained Example Selection for Text-to-SQL
·3833 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Department of Artificial Intelligence, Chung-Ang University
SAFE-SQL boosts Text-to-SQL accuracy by intelligently generating and filtering self-augmented examples for in-context learning, surpassing existing methods in challenging scenarios.
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?
·2710 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 School of Computer Science, Fudan University
Contrary to popular belief, longer reasoning chains don’t always boost Large Language Model (LLM) accuracy; this research reveals that parallel scaling with shorter solutions outperforms sequential scaling.
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning
·2524 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Xi'an Jiaotong University
The PhysReason benchmark evaluates physics-based reasoning in LLMs, revealing critical limitations and guiding future improvements.
Large Language Models and Mathematical Reasoning Failures
·397 words·2 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 KTH Royal Institute of Technology
Large language models struggle with mathematical word problems, exhibiting flawed reasoning despite high accuracy; a new study highlights these persistent gaps in generalization ability.
Language Complexity Measurement as a Noisy Zero-Shot Proxy for Evaluating LLM Performance
·1604 words·8 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 KTH Royal Institute of Technology
LLMs’ performance on language complexity tasks (LIX & ADD) reveals a strong correlation with general capabilities, suggesting complexity metrics as noisy zero-shot proxies for model evaluation.
Continuous Diffusion Model for Language Modeling
·1809 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Generation 🏢 Korea Advanced Institute of Science and Technology
RDLM: A novel continuous diffusion model for language modeling leverages the geometry of categorical distributions, outperforming existing discrete approaches and approaching autoregressive model performance.
Building A Proof-Oriented Programmer That Is 64% Better Than GPT-4o Under Data Scarcity
·2347 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Illinois Urbana-Champaign
PoPilot, a novel proof-oriented programming LLM, outperforms GPT-4o by 64% under data scarcity by using synthetic data augmentation.
Atom of Thoughts for Markov LLM Test-Time Scaling
·2660 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
Atom of Thoughts (AoT) revolutionizes LLM test-time scaling by decomposing complex reasoning into independent sub-questions, drastically reducing computation while maintaining high accuracy.
Talk Structurally, Act Hierarchically: A Collaborative Framework for LLM Multi-Agent Systems
·3486 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Sony Group Corporation
TalkHier, a novel framework for LLM multi-agent systems, uses structured communication and hierarchical refinement to achieve state-of-the-art performance on various tasks through improved agent collaboration.
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
·2722 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 DeepSeek-AI
NSA: a novel sparse attention mechanism achieves efficient long-context modeling by combining algorithmic innovations with hardware-aligned optimizations, surpassing full-attention models across various benchmarks.
How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training
·7040 words·34 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Zhejiang University
LLMs’ knowledge acquisition is examined through the lens of evolving knowledge circuits, revealing that the integration of new knowledge depends on its relevance to existing knowledge and proceeds in distinct phases.
FinMTEB: Finance Massive Text Embedding Benchmark
·3630 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hong Kong University of Science and Technology
FinMTEB: A new benchmark reveals that general-purpose embedding models struggle in the finance domain; domain-specific models excel, and surprisingly, simple BoW outperforms sophisticated models on certain tasks.
Dyve: Thinking Fast and Slow for Dynamic Process Verification
·1995 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong
Dyve: A novel dynamic process verifier boosts LLM reasoning accuracy by combining fast, immediate checks with deeper, slower analyses for complex steps, achieving significant performance gains.