Natural Language Processing

Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
·1708 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 KAIST
Sketch-of-Thought (SoT) reduces LLM token usage by up to 76% while maintaining (or improving) accuracy via cognitive-inspired sketching.
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
·3585 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Renmin University of China
R1-Searcher: reinforcement learning enhances LLMs by incentivizing autonomous search, outperforming existing RAG methods and even GPT-4o-mini.
Linear-MoE: Linear Sequence Modeling Meets Mixture-of-Experts
·3804 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Laboratory
Linear-MoE: Integrates Linear Sequence Modeling with Mixture-of-Experts, achieving efficiency gains and competitive performance in large language models.
TinyR1-32B-Preview: Boosting Accuracy with Branch-Merge Distillation
·570 words·3 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University
TinyR1-32B-Preview: A novel branch-merge distillation approach that significantly enhances model accuracy and reduces computational costs for LLMs.
Shifting Long-Context LLMs Research from Input to Output
·1724 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Generation 🏢 Singapore University of Technology and Design
Time to focus on LLM’s long-form outputs! This paper advocates for research on generating high-quality, long, and coherent text.
More Documents, Same Length: Isolating the Challenge of Multiple Documents in RAG
·1723 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 School of Computer Science and Engineering
Adding more documents can hurt RAG performance, even when the total input length stays the same.
Lost in Literalism: How Supervised Training Shapes Translationese in LLMs
·3432 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Machine Translation 🏢 Shanghai AI Laboratory
LLMs exhibit translationese due to biases in supervised training; polishing references and filtering unnatural training instances can mitigate the issue.
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval
·5266 words·25 mins
AI Generated 🤗 Daily Papers Natural Language Processing Information Extraction 🏢 School of Advanced Interdisciplinary Sciences, University of Chinese Academy of Sciences
IFIR: a new benchmark for instruction-following retrieval in expert domains, revealing current model limitations.
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
·449 words·3 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 School of Computer Science and Engineering, Sun Yat-Sen University, China
FuseChat-3.0: Heterogeneous model fusion boosts LLM performance via preference optimization, creating efficient and powerful language models.
Beyond RAG: Task-Aware KV Cache Compression for Comprehensive Knowledge Reasoning
·2528 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 SAP Labs
Task-aware KV cache compression enables efficient knowledge reasoning in LLMs.
An Empirical Study on Eliciting and Improving R1-like Reasoning Models
·3690 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Renmin University of China
This paper explores and improves R1-like reasoning models through RL and tool manipulation, achieving significant accuracy gains.
Process-based Self-Rewarding Language Models
·3066 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Nanjing University
Process-based Self-Rewarding advances LLMs' mathematical reasoning beyond human performance through step-wise self-evaluation.
Wikipedia in the Era of LLMs: Evolution and Risks
·3967 words·19 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Huazhong University of Science and Technology
LLMs modestly affect Wikipedia, subtly altering content and potentially skewing NLP benchmarks.
QE4PE: Word-level Quality Estimation for Human Post-Editing
·6157 words·29 mins
AI Generated 🤗 Daily Papers Natural Language Processing Machine Translation 🏢 CLCG, University of Groningen
QE4PE investigates the impact of word-level quality estimation on MT post-editing with 42 professional editors across English-Italian and English-Dutch, highlighting usability and accuracy challenges in professional workflows.
Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs
·2943 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai Jiao Tong University
Mask-DPO: Fine-grained Factuality Alignment improves LLMs’ factuality by masking sentence-level errors during DPO training for enhanced knowledge alignment.
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation
·4618 words·22 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Oxford
LINGOLY-TOO: A new benchmark to disentangle memorization from reasoning in LLMs using linguistic templatization and orthographic obfuscation.
Iterative Value Function Optimization for Guided Decoding
·2523 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Text Generation 🏢 Shanghai Artificial Intelligence Laboratory
IVO: Iterative Value Function Optimization for Guided Decoding
Word Form Matters: LLMs' Semantic Reconstruction under Typoglycemia
·2734 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 MBZUAI
Unlike humans, LLMs rely primarily on word form when reconstructing semantics, indicating a need for context-aware mechanisms to improve their adaptability.
When an LLM is apprehensive about its answers -- and when its uncertainty is justified
·3209 words·16 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Skolkovo Institute of Science and Technology (Skoltech)
This paper investigates when LLMs are apprehensive and when their uncertainty is justified.
SampleMix: A Sample-wise Pre-training Data Mixing Strategey by Coordinating Data Quality and Diversity
·2929 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Meituan Group
SampleMix: Sample-wise Pre-training Data Mixing by Coordinating Data Quality and Diversity