
2025-01-31

2025

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
·3471 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NYU
WILDCHAT-50M: Largest public chat dataset refines LLM post-training, showing superior SFT performance with fewer samples.
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
·2085 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tencent AI Lab
Large language models (LLMs) often prematurely abandon promising reasoning paths, a phenomenon called ‘underthinking’. This paper introduces a novel metric to quantify this issue and proposes a decoding…
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
·5509 words·26 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 Google DeepMind
Streaming DiLoCo achieves two orders of magnitude bandwidth reduction in billion-scale parameter LLM training by synchronizing parameter subsets sequentially, overlapping communication with computation.
o3-mini vs DeepSeek-R1: Which One is Safer?
·578 words·3 mins
AI Generated 🤗 Daily Papers AI Theory Safety 🏢 Mondragon University
ASTRAL, a novel automated safety testing tool, reveals DeepSeek-R1’s significantly higher unsafe response rate compared to OpenAI’s o3-mini, highlighting critical safety concerns in advanced LLMs.
GuardReasoner: Towards Reasoning-based LLM Safeguards
·5624 words·27 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore
GuardReasoner enhances LLM safety with reasoning-based guardrails, improving performance, explainability, and generalization on various benchmarks.
Large Language Models Think Too Fast To Explore Effectively
·3497 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Georgia Institute of Technology
Large language models underperform humans in open-ended exploration because they prioritize immediate choices over long-term strategic thinking, though newer models show promise.