2025-01-31
WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
·3471 words·17 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 NYU
WILDCHAT-50M: Largest public chat dataset refines LLM post-training, showing superior SFT performance with fewer samples.
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
·2085 words·10 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tencent AI Lab
Large language models (LLMs) often prematurely abandon promising reasoning paths, a phenomenon called ‘underthinking’. This paper introduces a novel metric to quantify this issue and proposes a decodi…
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
·5509 words·26 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Federated Learning
🏢 Google DeepMind
Streaming DiLoCo achieves two orders of magnitude bandwidth reduction in billion-scale parameter LLM training by synchronizing parameter subsets sequentially, overlapping communication with computatio…
o3-mini vs DeepSeek-R1: Which One is Safer?
·578 words·3 mins·
AI Generated
🤗 Daily Papers
AI Theory
Safety
🏢 Mondragon University
ASTRAL, a novel automated safety testing tool, reveals DeepSeek-R1’s significantly higher unsafe response rate compared to OpenAI’s o3-mini, highlighting critical safety concerns in advanced LLMs.
GuardReasoner: Towards Reasoning-based LLM Safeguards
·5624 words·27 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 National University of Singapore
GuardReasoner enhances LLM safety with reasoning-based guardrails, improving performance, explainability, and generalization on various benchmarks.
Large Language Models Think Too Fast To Explore Effectively
·3497 words·17 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Large language models underperform humans in open-ended exploration due to prioritizing immediate choices over long-term strategic thinking, but innovative models show promise.