
Natural Language Processing

ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer
·1758 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University
ARWKV: A novel RNN-attention-based language model, distilled from a larger model, achieves strong performance using significantly fewer resources, opening a new path in efficient language model development.
RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques
·2423 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 The Chinese University of Hong Kong, Shenzhen
RealCritic: A new benchmark effectively evaluates language models’ critique abilities using a closed-loop methodology, showcasing advanced reasoning models’ superiority in self and iterative critique.
Humanity's Last Exam
·2314 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Center for AI Safety
Humanity’s Last Exam (HLE): a groundbreaking multi-modal benchmark pushing the boundaries of large language model (LLM) capabilities, revealing a significant gap between current LLMs and human experts.
Chain-of-Retrieval Augmented Generation
·4155 words·20 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Microsoft Research
CoRAG, a novel Chain-of-Retrieval Augmented Generation model, dynamically refines queries for improved accuracy in multi-hop question answering, achieving state-of-the-art performance.
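In spirit, chain-of-retrieval interleaves sub-query generation with retrieval instead of retrieving once up front. A rough sketch of that loop, with `retrieve` and `llm` as placeholder callables (the prompts and stopping rule are illustrative, not the paper's):

```python
def chain_of_retrieval(question: str, retrieve, llm, max_steps: int = 4) -> str:
    """Iteratively decompose a question into sub-queries, retrieving after each.

    retrieve(query) -> list[str] of passages; llm(prompt) -> str.
    Both are placeholders standing in for a real retriever and model.
    """
    evidence: list[str] = []
    query = question
    for _ in range(max_steps):
        evidence.extend(retrieve(query))  # gather passages for the current sub-query
        # ask the model what to look up next, given the accumulated evidence
        query = llm(
            f"Question: {question}\nEvidence so far: {evidence}\n"
            "State the next sub-query needed, or reply DONE."
        )
        if query.strip().upper() == "DONE":
            break
    # final answer conditioned on the whole retrieval chain
    return llm(f"Question: {question}\nEvidence: {evidence}\nFinal answer:")
```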
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models
·8384 words·40 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research
SIGMA, a novel large language model, achieves up to 33.36% faster inference speeds by using DiffQKV attention, which differentially optimizes query, key, and value components in the attention mechanism.
Low-Rank Adapters Meet Neural Architecture Search for LLM Compression
·2154 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Intel Labs
Low-rank adapters combined with neural architecture search revolutionize LLM compression, enabling efficient fine-tuning and significantly reduced memory footprint.
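One minimal way to picture the "adapters meet NAS" idea is a LoRA layer whose rank is elastic, so a search procedure can try smaller sub-adapters per layer. A sketch under that assumption (class and parameter names are hypothetical, not the paper's API):

```python
import torch
import torch.nn as nn

class ElasticLoRALinear(nn.Module):
    """A frozen linear layer plus a low-rank adapter whose active rank is
    adjustable, so an architecture search can evaluate sub-adapters cheaply."""

    def __init__(self, base: nn.Linear, max_rank: int = 16):
        super().__init__()
        self.base = base.requires_grad_(False)       # frozen pretrained weights
        self.A = nn.Parameter(torch.randn(max_rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, max_rank))
        self.rank = max_rank                         # the NAS-searchable knob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        r = self.rank  # only the first r rank-one components are active
        return self.base(x) + x @ self.A[:r].T @ self.B[:, :r].T
```

Because `B` starts at zero, the adapter is a no-op at initialization, and shrinking `rank` after training simply drops the trailing rank-one components.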
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
·2592 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong
Large language models (LLMs) are rapidly evolving, yet often struggle to adapt to human preferences quickly. This paper introduces Test-Time Preference Optimization (TPO), an innovative framework that aligns model outputs with human preferences during inference through iterative textual feedback, without retraining.
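In spirit, TPO is a critique-and-revise loop at inference time, with no weight updates. A minimal sketch, assuming placeholder `llm` and `critique` callables rather than the paper's actual prompts or reward-to-text translation:

```python
def test_time_preference_optimize(prompt: str, llm, critique, iters: int = 3) -> str:
    """Iteratively refine a response using textual feedback only.

    llm(prompt) -> str; critique(prompt, response) -> str of textual feedback.
    Both callables are placeholders for a real model and preference signal.
    """
    response = llm(prompt)
    for _ in range(iters):
        feedback = critique(prompt, response)  # preference signal expressed as text
        response = llm(
            f"Instruction: {prompt}\nDraft: {response}\n"
            f"Feedback: {feedback}\nRewrite the draft to address the feedback:"
        )
    return response
```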
Pairwise RM: Perform Best-of-N Sampling with Knockout Tournament
·2172 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
Pairwise RM, a novel reward model with knockout tournaments, significantly boosts large language model accuracy in test-time scaling by comparing solution pairs, eliminating arbitrary scoring inconsistencies.
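The knockout-tournament selection reduces to a single-elimination bracket over candidate solutions. A minimal sketch, assuming a `pairwise_judge(question, a, b)` callable that wraps the reward model and returns the preferred candidate:

```python
import random

def knockout_best_of_n(question: str, candidates: list[str], pairwise_judge) -> str:
    """Single-elimination tournament: compare candidates pairwise, winners advance."""
    pool = list(candidates)
    random.shuffle(pool)  # random bracket seeding
    while len(pool) > 1:
        winners = [pairwise_judge(question, a, b)
                   for a, b in zip(pool[0::2], pool[1::2])]
        if len(pool) % 2 == 1:       # the odd candidate out gets a bye
            winners.append(pool[-1])
        pool = winners
    return pool[0]
```

Note that a bracket needs only N-1 pairwise comparisons to pick a winner from N candidates, avoiding any absolute scoring of individual solutions.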
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
·2220 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shenzhen Campus of Sun Yat-Sen University
O1-Pruner efficiently prunes long-thought reasoning in LLMs by harmonizing reasoning length and accuracy via fine-tuning, significantly reducing inference time without sacrificing performance.
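The length-harmonizing idea can be pictured as an RL-style reward that trades brevity against correctness relative to a reference solution. A toy version (the weighting and exact objective are assumptions, not the paper's formula):

```python
def length_harmonizing_reward(is_correct: bool, length: int,
                              ref_length: float, lam: float = 2.0) -> float:
    """Toy reward: +1 for a correct answer, plus a term that rewards solutions
    shorter than a reference chain-of-thought and penalizes longer ones."""
    accuracy = 1.0 if is_correct else 0.0
    brevity = (ref_length - length) / ref_length  # positive when shorter
    return accuracy + lam * brevity
```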
Kimi k1.5: Scaling Reinforcement Learning with LLMs
·1386 words·7 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Moonshot AI
Kimi K1.5: A Multimodal LLM trained with RL achieves state-of-the-art reasoning by scaling long context RL training and improving policy optimization.
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
·2866 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 DeepSeek-AI
DeepSeek-R1 significantly improves LLM reasoning by using reinforcement learning, achieving performance comparable to OpenAI’s top models while addressing previous challenges of poor readability and language mixing.
Autonomy-of-Experts Models
·2476 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tencent AI Lab
Revolutionizing large language models, Autonomy-of-Experts (AoE) empowers individual expert modules to autonomously select inputs, eliminating routers and boosting both efficiency and accuracy.
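A compact way to picture router-free selection in the AoE spirit: each expert scores tokens by the norm of its own internal pre-activation, and only the top-scoring experts process each token. A simplified sketch (the norm criterion and tensor layout here are assumptions):

```python
import torch

def aoe_layer(x: torch.Tensor, experts: list[dict], k: int = 2) -> torch.Tensor:
    """x: (tokens, d_model); each expert dict holds 'w_in' (d_model, d_hidden)
    and 'w_out' (d_hidden, d_model). No router: experts self-select via the
    norms of their own first-layer activations."""
    # every expert computes its activation norm per token as an affinity score
    scores = torch.stack([(x @ e["w_in"]).norm(dim=-1) for e in experts], dim=-1)
    top = scores.topk(k, dim=-1).indices           # (tokens, k) chosen experts
    out = torch.zeros_like(x)
    for i, e in enumerate(experts):
        mask = (top == i).any(dim=-1)              # tokens that picked expert i
        if mask.any():
            h = torch.relu(x[mask] @ e["w_in"])
            out[mask] += h @ e["w_out"]
    return out
```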
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
·6574 words·31 mins
AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Yale NLP
MMVU: a new benchmark pushes multimodal video understanding to expert level, revealing limitations of current models and paving the way for more advanced AI.
Debate Helps Weak-to-Strong Generalization
·2415 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tongyi Lab
Debate-enhanced weak supervision boosts AI alignment by combining strong and weak models, enabling safer and more reliable AI systems.
Redundancy Principles for MLLMs Benchmarks
·4576 words·22 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Lab
This research proposes principles and a framework to tackle redundancy in MLLM benchmarks, enhancing efficiency and guiding future development.
Reasoning Language Models: A Blueprint
·3562 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 ETH Zurich
Democratizing advanced reasoning in AI, this blueprint introduces a modular framework for building Reasoning Language Models (RLMs), simplifying development and enhancing accessibility.
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
·2333 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Illinois Urbana-Champaign
Mobile-Agent-E: A self-evolving mobile assistant conquering complex tasks with hierarchical agents and a novel self-evolution module, significantly outperforming prior approaches.
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
·4105 words·20 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Fudan University
Agent-R: A novel self-training framework enables language model agents to learn from errors by dynamically constructing training data that corrects erroneous actions, resulting in significantly improved performance.
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems
·1691 words·8 mins
AI Generated 🤗 Daily Papers Natural Language Processing Dialogue Systems 🏢 Plurai
IntellAgent: a novel open-source framework automating diverse conversational AI evaluation via policy-driven graph modeling, event generation, and user-agent simulations, enabling fine-grained diagnostics.
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
·704 words·4 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Meta GenAI
STEP-KTO: A novel training framework boosts LLMs’ mathematical reasoning by providing binary feedback on both intermediate steps and final answers. This ensures logical reasoning trajectories and improves final answer accuracy.