
Large Language Models

ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition
·3118 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai Artificial Intelligence Laboratory
ResearchBench benchmarks LLMs for scientific discovery by decomposing the task into inspiration-based sub-tasks, from retrieving inspirations to composing and ranking hypotheses.
Large Language Model Agent: A Survey on Methodology, Applications and Challenges
·2979 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University
This survey presents a methodology-centered taxonomy of LLM agent systems, linking design principles to emergent behaviors and identifying future research directions.
Challenging the Boundaries of Reasoning: An Olympiad-Level Math Benchmark for Large Language Models
·5419 words·26 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Renmin University of China
OlymMATH: A new Olympiad-level math benchmark rigorously tests LLMs’ reasoning, revealing limitations and paving the way for advancements.
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
·2301 words·11 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Laboratory
Survey on improving efficiency in large reasoning models across language, multimodality, and beyond.
Think Twice: Enhancing LLM Reasoning by Scaling Multi-round Test-time Thinking
·1979 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 A-M-Team
Boost LLM reasoning by having models ‘Think Twice’! This novel method iteratively refines answers, significantly enhancing accuracy on complex tasks.
FFN Fusion: Rethinking Sequential Computation in Large Language Models
·3776 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NVIDIA
FFN Fusion: Parallelizing sequential computation in large language models for significant speedups!
Lost in Cultural Translation: Do LLMs Struggle with Math Across Cultural Contexts?
·3575 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 155mv Research Lab
LLMs falter on culturally adapted math problems, revealing a critical cultural bias.
V-Seek: Accelerating LLM Reasoning on Open-hardware Server-class RISC-V Platforms
·1371 words·7 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Politecnico di Torino
V-SEEK accelerates LLM reasoning on open-hardware RISC-V platforms, achieving up to 3.0x speedup through optimized kernels and memory management.
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization
·8765 words·42 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Xi'an Jiaotong University
MARS: Optimizing prompts with multi-agent collaboration and Socratic learning for better LLM performance!
LEMMA: Learning from Errors for MatheMatical Advancement in LLMs
·4802 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
LEMMA: LLMs learn math via mistake analysis and correction, boosting performance without external critics.
XAttention: Block Sparse Attention with Antidiagonal Scoring
·2960 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
XAttention: Antidiagonal scoring unlocks block-sparse attention, slashing compute costs in long-context Transformers without sacrificing accuracy.
Survey on Evaluation of LLM-based Agents
·396 words·2 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Hebrew University of Jerusalem
A comprehensive survey on evaluation methodologies for LLM-based agents, analyzing benchmarks and frameworks across key dimensions like capabilities, applications, and generalist performance.
Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models
·3774 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Rice University
A survey of model-, output-, and prompt-based strategies for efficient LLM reasoning, mitigating ‘overthinking’ for faster, cheaper real-world applications.
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion
·2769 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Renmin University of China
MathFusion: Instruction fusion enhances LLMs’ math problem-solving!
CaKE: Circuit-aware Editing Enables Generalizable Knowledge Learners
·3734 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of California, Los Angeles
CaKE: Editing LLMs to Enhance Knowledge Generalization Across Reasoning Tasks.
BigO(Bench) -- Can LLMs Generate Code with Controlled Time and Space Complexity?
·3082 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 FAIR at Meta
BigO(Bench) evaluates whether LLMs can generate code with controlled time/space complexity, addressing a gap in current evaluations and encouraging further exploration.
Temporal Consistency for LLM Reasoning Process Error Identification
·3234 words·16 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Princeton University
A new test-time method, Temporal Consistency, is introduced to improve LLM reasoning by leveraging iterative self-reflection.
Pensez: Less Data, Better Reasoning -- Rethinking French LLM
·3508 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Université Grenoble Alpes
Pensez: Strategic fine-tuning beats massive data for superior reasoning in French LLMs, challenging conventional wisdom.
$φ$-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation
·3341 words·16 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Lab
φ-Decoding: Adaptive foresight sampling balances inference-time exploration and exploitation for better LLM reasoning.
Investigating Human-Aligned Large Language Model Uncertainty
·1326 words·7 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Vanderbilt University
This research explores how well LLM uncertainty measures align with human uncertainty, finding Bayesian and top-k entropy measures show promise.