Natural Language Processing
O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning
·2220 words·11 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Shenzhen Campus of Sun Yat-Sen University
O1-Pruner efficiently prunes long-thought reasoning in LLMs by harmonizing reasoning length and accuracy via fine-tuning, significantly reducing inference time without sacrificing performance.
Kimi k1.5: Scaling Reinforcement Learning with LLMs
·1386 words·7 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Moonshot AI
Kimi K1.5: A Multimodal LLM trained with RL achieves state-of-the-art reasoning by scaling long context RL training and improving policy optimization.
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning
·2866 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 DeepSeek-AI
DeepSeek-R1 significantly improves LLM reasoning by using reinforcement learning, achieving performance comparable to OpenAI’s top models while addressing previous challenges of poor readability and language mixing.
Autonomy-of-Experts Models
·2476 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
Autonomy-of-Experts (AoE) rethinks mixture-of-experts LLMs by letting individual expert modules autonomously select their own inputs, eliminating routers and boosting both efficiency and accuracy.
MMVU: Measuring Expert-Level Multi-Discipline Video Understanding
·6574 words·31 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 Yale NLP
MMVU: a new benchmark pushes multimodal video understanding to expert level, revealing limitations of current models and paving the way for more advanced AI.
Debate Helps Weak-to-Strong Generalization
·2415 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tongyi Lab
Debate-enhanced weak supervision boosts AI alignment by combining strong and weak models, enabling safer and more reliable AI systems.
Redundancy Principles for MLLMs Benchmarks
·4576 words·22 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Shanghai AI Lab
This research proposes principles and a framework to tackle redundancy in MLLM benchmarks, enhancing efficiency and guiding future development.
Reasoning Language Models: A Blueprint
·3562 words·17 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 ETH Zurich
Democratizing advanced reasoning in AI, this blueprint introduces a modular framework for building Reasoning Language Models (RLMs), simplifying development and enhancing accessibility.
Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
·2333 words·11 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Illinois Urbana-Champaign
Mobile-Agent-E: A self-evolving mobile assistant conquering complex tasks with hierarchical agents and a novel self-evolution module, significantly outperforming prior approaches.
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training
·4105 words·20 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Fudan University
Agent-R: A novel self-training framework enables language model agents to learn from errors by dynamically constructing training data that corrects erroneous actions, resulting in significantly improv…
IntellAgent: A Multi-Agent Framework for Evaluating Conversational AI Systems
·1691 words·8 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Dialogue Systems
🏢 Plurai
IntellAgent: a novel open-source framework automating diverse conversational AI evaluation via policy-driven graph modeling, event generation, and user-agent simulations, enabling fine-grained diagnos…
Step-KTO: Optimizing Mathematical Reasoning through Stepwise Binary Feedback
·704 words·4 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Meta GenAI
STEP-KTO: A novel training framework boosts LLMs’ mathematical reasoning by providing binary feedback on both intermediate steps and final answers. This ensures logical reasoning trajectories and impr…
PaSa: An LLM Agent for Comprehensive Academic Paper Search
·4507 words·22 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Peking University
PaSa: An LLM agent autonomously performs comprehensive academic paper searches, outperforming existing methods by efficiently combining search tools, paper reading, and citation analysis, optimized vi…
Evolving Deeper LLM Thinking
·7089 words·34 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Google DeepMind
Mind Evolution, a novel evolutionary search strategy, significantly boosts Large Language Model (LLM) problem-solving by generating, recombining, and refining candidate solutions via an LLM, outperfor…
ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario
·3933 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
ComplexFuncBench, a new benchmark, rigorously evaluates LLMs’ complex function-calling abilities across real-world scenarios involving multi-step processes, constraints, and long contexts.
Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models
·1945 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
This survey explores the emerging frontier of Large Reasoning Models (LRMs), focusing on how reinforcement learning and prompting techniques are boosting LLMs’ reasoning capabilities.
Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong
·1926 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Nanjing University of Aeronautics and Astronautics
LLM reasoning boosts self-confidence, even when answers are wrong, highlighting limitations in current evaluation metrics.
Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators
·2252 words·11 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Dialogue Systems
🏢 Baichuan Inc.
AI-powered medical consultations often struggle with the inquiry phase. This paper presents a novel patient simulator trained on real interactions, revealing that effective inquiry significantly impacts diagnostic accuracy.
Bridging Language Barriers in Healthcare: A Study on Arabic LLMs
·1632 words·8 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 M42 Health
Arabic LLMs struggle with medical tasks; this study reveals optimal language ratios in training data for improved performance, highlighting challenges in simply translating medical data for different …
RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation
·5724 words·27 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Princeton University
RLHS, a novel alignment algorithm, leverages simulated hindsight feedback to mitigate misalignment in RLHF, significantly improving AI’s alignment with human values and goals.