Natural Language Processing
URSA: Understanding and Verifying Chain-of-thought Reasoning in Multimodal Mathematics
·5517 words·26 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
URSA-7B: A new multimodal model significantly improves chain-of-thought reasoning in mathematics!
rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking
·3910 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Research
Small language models can master complex math reasoning using self-evolved deep thinking via Monte Carlo Tree Search, surpassing larger models in performance.
LLM4SR: A Survey on Large Language Models for Scientific Research
·2870 words·14 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Texas at Dallas
LLMs revolutionize scientific research! This survey reveals their transformative potential across hypothesis discovery, experiment planning, writing, and peer review, guiding future research.
EpiCoder: Encompassing Diversity and Complexity in Code Generation
·5051 words·24 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
EpiCoder revolutionizes code generation by using feature trees to create diverse and complex training data, resulting in state-of-the-art performance on various benchmarks.
Building Foundations for Natural Language Processing of Historical Turkish: Resources and Models
·3036 words·15 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Named Entity Recognition
🏢 Boğaziçi University
First-ever resources (NER dataset, dependency treebank, and corpus) and models for historical Turkish NLP are introduced, significantly advancing research capabilities in this underexplored field.
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides
·3721 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Text Generation
🏢 Chinese Academy of Sciences
PPTAgent, a novel two-stage framework, significantly improves automatic presentation generation by leveraging an edit-based workflow and a new evaluation metric, outperforming existing end-to-end methods.
Entropy-Guided Attention for Private LLMs
·5203 words·25 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 New York University
Boosting private LLMs’ efficiency and security, this research introduces an entropy-guided attention mechanism and PI-friendly layer normalization to mitigate the overheads of nonlinear operations.
Samba-ASR: State-of-the-Art Speech Recognition Leveraging Structured State-Space Models
·1451 words·7 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Speech Recognition
🏢 SandLogic Technologies Pvt Ltd
Samba-ASR, a novel speech recognition model using Mamba architecture, surpasses existing transformer models in accuracy and efficiency, setting a new benchmark for future ASR research.
GeAR: Generation Augmented Retrieval
·1952 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 Microsoft Research
GeAR, a new retrieval model, boosts accuracy by combining document retrieval with fine-grained information generation, leading to better understanding and improved localization.
BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning
·2687 words·13 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Shanghai AI Laboratory
BoostStep enhances large language models’ mathematical abilities by refining single-step reasoning through a novel step-level in-context learning strategy, achieving significant improvements on variou…
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use
·3646 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 ByteDance
ToolHop: New benchmark dataset rigorously evaluates LLMs’ multi-hop tool use, revealing significant challenges and variations across different LLM families.
Test-time Computing: from System-1 Thinking to System-2 Thinking
·658 words·4 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Soochow University
Unlocking LLM potential: This paper surveys test-time computing, showing how it boosts reasoning abilities by shifting from reactive System-1 to deliberate System-2 thinking, paving the way for more p…
Scaling Laws for Floating Point Quantization Training
·6363 words·30 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
New scaling laws for efficient floating-point quantization training in LLMs are presented, showing optimal bit allocation and critical data size.
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
·1374 words·7 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 String
REINFORCE++, a novel RLHF algorithm, achieves superior training stability and computational efficiency compared to existing methods like PPO and GRPO, while maintaining comparable performance.
Personalized Graph-Based Retrieval for Large Language Models
·3633 words·18 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of California Santa Cruz
Personalized Graph-based Retrieval-Augmented Generation (PGraphRAG) significantly improves personalized text generation by leveraging user-centric knowledge graphs, especially in cold-start scenarios …
METAGENE-1: Metagenomic Foundation Model for Pandemic Monitoring
·3440 words·17 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Southern California
METAGENE-1, a 7-billion parameter language model, achieves state-of-the-art results in pathogen detection and genomic embedding by leveraging a massive wastewater metagenomic dataset.
Auto-RT: Automatic Jailbreak Strategy Exploration for Red-Teaming Large Language Models
·3986 words·19 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Ant Group
Auto-RT automates LLM vulnerability discovery by using reinforcement learning to optimize complex attack strategies, achieving faster detection and higher success rates than existing methods.
Dynamic Scaling of Unit Tests for Code Reward Modeling
·3208 words·16 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
Boosting code generation accuracy with more unit tests! This research shows that increasing the number of unit tests used to evaluate code generated by LLMs significantly improves accuracy, especially…
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
·2397 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Alibaba Group
The CodeElo benchmark uses Codeforces to fairly evaluate LLMs’ coding abilities, providing human-comparable Elo ratings and addressing limitations of existing benchmarks.
BoxingGym: Benchmarking Progress in Automated Experimental Design and Model Discovery
·4247 words·20 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Stanford University
BoxingGym: A new benchmark rigorously evaluates AI agents’ ability to design experiments and discover scientific models, revealing current LLMs’ limitations and highlighting fertile research avenues.