Natural Language Processing

PaSa: An LLM Agent for Comprehensive Academic Paper Search

17 January 2025·4507 words·22 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University

PaSa, an LLM agent, autonomously performs comprehensive academic paper searches, outperforming existing methods by efficiently combining search tools, paper reading, and citation analysis, optimized vi…

Evolving Deeper LLM Thinking

17 January 2025·7089 words·34 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Google DeepMind

Mind Evolution, a novel evolutionary search strategy, significantly boosts Large Language Model (LLM) problem-solving by generating, recombining, and refining candidate solutions via an LLM, outperfor…

ComplexFuncBench: Exploring Multi-Step and Constrained Function Calling under Long-Context Scenario

17 January 2025·3933 words·19 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University

ComplexFuncBench, a new benchmark, rigorously evaluates LLMs’ complex function-calling abilities across real-world scenarios involving multi-step processes, constraints, and long contexts.

Towards Large Reasoning Models: A Survey of Reinforced Reasoning with Large Language Models

16 January 2025·1945 words·10 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University

This survey examines Large Reasoning Models (LRMs), focusing on how reinforcement learning and prompting techniques improve LLMs’ reasoning capabilities.

Multiple Choice Questions: Reasoning Makes Large Language Models (LLMs) More Self-Confident Even When They Are Wrong

16 January 2025·1926 words·10 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Nanjing University of Aeronautics and Astronautics

Reasoning makes LLMs more confident in their multiple-choice answers, even when those answers are wrong, highlighting limitations in current evaluation metrics.

Exploring the Inquiry-Diagnosis Relationship with Advanced Patient Simulators

16 January 2025·2252 words·11 mins

AI Generated 🤗 Daily Papers Natural Language Processing Dialogue Systems 🏢 Baichuan Inc.

AI-powered medical consultations often struggle with the inquiry phase. This paper presents a novel patient simulator trained on real interactions, revealing that effective inquiry significantly impac…

Bridging Language Barriers in Healthcare: A Study on Arabic LLMs

16 January 2025·1632 words·8 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 M42 Health

Arabic LLMs struggle with medical tasks; this study reveals optimal language ratios in training data for improved performance, highlighting challenges in simply translating medical data for different …

RLHS: Mitigating Misalignment in RLHF with Hindsight Simulation

15 January 2025·5724 words·27 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Princeton University

RLHS, a novel alignment algorithm, leverages simulated hindsight feedback to mitigate misalignment in RLHF, significantly improving AI’s alignment with human values and goals.

URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics

8 January 2025·5517 words·26 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University

URSA-7B, a new multimodal model, significantly improves chain-of-thought reasoning in mathematics.

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

8 January 2025·3910 words·19 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research

Small language models can master complex math reasoning using self-evolved deep thinking via Monte Carlo Tree Search, surpassing larger models in performance.

LLM4SR: A Survey on Large Language Models for Scientific Research

8 January 2025·2870 words·14 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Texas at Dallas

This survey examines how LLMs are being applied across hypothesis discovery, experiment planning, writing, and peer review, and points to directions for future research.

EpiCoder: Encompassing Diversity and Complexity in Code Generation

8 January 2025·5051 words·24 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University

EpiCoder uses feature trees to create diverse and complex training data for code generation, resulting in state-of-the-art performance on various benchmarks.

Building Foundations for Natural Language Processing of Historical Turkish: Resources and Models

8 January 2025·3036 words·15 mins

AI Generated 🤗 Daily Papers Natural Language Processing Named Entity Recognition 🏢 Boğaziçi University

First-ever resources (NER dataset, dependency treebank, and corpus) and models for historical Turkish NLP are introduced, significantly advancing research capabilities in this underexplored field.

PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides

7 January 2025·3721 words·18 mins

AI Generated 🤗 Daily Papers Natural Language Processing Text Generation 🏢 Chinese Academy of Sciences

PPTAgent, a novel two-stage framework, significantly improves automatic presentation generation by leveraging an edit-based workflow and a new evaluation metric, outperforming existing end-to-end meth…

Entropy-Guided Attention for Private LLMs

7 January 2025·5203 words·25 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 New York University

Boosting private LLMs’ efficiency and security, this research introduces an entropy-guided attention mechanism and private inference (PI)-friendly layer normalization to mitigate the overheads of nonlinear operations.

Samba-ASR: State-of-the-Art Speech Recognition Leveraging Structured State-Space Models

6 January 2025·1451 words·7 mins

AI Generated 🤗 Daily Papers Natural Language Processing Speech Recognition 🏢 SandLogic Technologies Pvt Ltd

Samba-ASR, a novel speech recognition model using Mamba architecture, surpasses existing transformer models in accuracy and efficiency, setting a new benchmark for future ASR research.

GeAR: Generation Augmented Retrieval

6 January 2025·1952 words·10 mins

AI Generated 🤗 Daily Papers Natural Language Processing Question Answering 🏢 Microsoft Research

GeAR, a new retrieval model, boosts accuracy by combining document retrieval with fine-grained information generation, leading to better understanding and improved localization.

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

6 January 2025·2687 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Laboratory

BoostStep enhances large language models’ mathematical abilities by refining single-step reasoning through a novel step-level in-context learning strategy, achieving significant improvements on variou…

ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use

5 January 2025·3646 words·18 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 ByteDance

ToolHop, a new benchmark dataset, rigorously evaluates LLMs’ multi-hop tool use, revealing significant challenges and variations across different LLM families.

Test-time Computing: from System-1 Thinking to System-2 Thinking

5 January 2025·658 words·4 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Soochow University

Unlocking LLM potential: This paper surveys test-time computing, showing how it boosts reasoning abilities by shifting from reactive System-1 to deliberate System-2 thinking, paving the way for more p…