
2025-01-31

2025

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training
·3471 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NYU
WILDCHAT-50M: Largest public chat dataset refines LLM post-training, showing superior SFT performance with fewer samples.
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs
·2085 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tencent AI Lab
Large language models (LLMs) often prematurely abandon promising reasoning paths, a phenomenon called ‘underthinking’. This paper introduces a novel metric to quantify this issue and proposes a decoding…
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
·5509 words·26 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 Google DeepMind
Streaming DiLoCo achieves two orders of magnitude bandwidth reduction in billion-scale parameter LLM training by synchronizing parameter subsets sequentially, overlapping communication with computation.
o3-mini vs DeepSeek-R1: Which One is Safer?
·578 words·3 mins
AI Generated 🤗 Daily Papers AI Theory Safety 🏢 Mondragon University
ASTRAL, a novel automated safety testing tool, reveals DeepSeek-R1’s significantly higher unsafe response rate compared to OpenAI’s o3-mini, highlighting critical safety concerns in advanced LLMs.
GuardReasoner: Towards Reasoning-based LLM Safeguards
·5624 words·27 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore
GuardReasoner enhances LLM safety with reasoning-based guardrails, improving performance, explainability, and generalization on various benchmarks.
Large Language Models Think Too Fast To Explore Effectively
·3497 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Georgia Institute of Technology
Large language models underperform humans in open-ended exploration because they prioritize immediate choices over long-term strategic thinking, though newer models show promise.