Machine Learning

S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
·3894 words·19 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Tencent
S2R: Teaches LLMs to self-verify and self-correct, boosting reasoning with efficient reinforcement learning.
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
·6586 words·31 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 National University of Singapore
NExT-Mol: Combines 1D language models with 3D diffusion for molecule generation, achieving state-of-the-art performance and validity.
Eager Updates For Overlapped Communication and Computation in DiLoCo
·3815 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 Google DeepMind
Eager updates drastically speed up training massive language models by cleverly overlapping communication and computation in DiLoCo, achieving near-optimal performance even with low bandwidth.
Thinking Preference Optimization
·5794 words·28 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Case Western Reserve University
ThinkPO improves LLM reasoning by preferring longer CoT, boosting performance without new data.
Small Models Struggle to Learn from Strong Reasoners
·4149 words·20 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 University of Washington
Small language models struggle to learn complex reasoning from large models, but a novel ‘Mix Distillation’ method balances complexity for effective capability transfer.
Towards Data-Efficient Pretraining for Atomic Property Prediction
·3694 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Transfer Learning 🏢 King Abdullah University of Science and Technology
High-quality, task-relevant pretraining data surpasses large-scale pretraining in atomic property prediction, achieving comparable performance at 1/24th the computational cost.
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
·4399 words·21 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
MIKASA, a new benchmark for memory-intensive reinforcement learning, provides a unified framework for evaluating memory capabilities in diverse scenarios, including complex robotic manipulation tasks.
AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting
·3650 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Huawei Noah's Ark Lab, Paris, France
AdaPTS effectively adapts pre-trained univariate time series models to probabilistic multivariate forecasting, improving accuracy and uncertainty quantification.
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights
·3096 words·15 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 School of Computer Science and Engineering
ProbeLog: Zero-shot model search directly from weights, boosting efficiency and accuracy!
Agency Is Frame-Dependent
·400 words·2 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Google DeepMind
Agency, a key concept in AI, is shown to be relative to the observer’s perspective (frame-dependent), challenging traditional binary definitions and necessitating a more nuanced approach for AI systems.
Improving Transformer World Models for Data-Efficient RL
·2775 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Google DeepMind
AI agents now master complex tasks with improved Transformer World Models, achieving a new state-of-the-art in data-efficient reinforcement learning.
ACECODER: Acing Coder RL via Automated Test-Case Synthesis
·3269 words·16 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Waterloo
AceCoder uses automated test-case synthesis to create a large-scale dataset for training reward models, enabling effective reinforcement learning to significantly boost code generation model performance.
Weak-to-Strong Diffusion with Reflection
·4655 words·22 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Hong Kong University of Science and Technology
W2SD: A novel framework boosts diffusion model quality by using the difference between weak and strong models to refine sampling trajectories, achieving state-of-the-art performance.
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
·5509 words·26 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 Google DeepMind
Streaming DiLoCo achieves a two-orders-of-magnitude bandwidth reduction in billion-parameter LLM training by synchronizing parameter subsets sequentially, overlapping communication with computation.
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
·3632 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
SRMT: Shared Recurrent Memory Transformer boosts multi-agent coordination by implicitly sharing information via a global memory, significantly outperforming baselines in complex pathfinding tasks.
Graph Generative Pre-trained Transformer
·3057 words·15 mins
AI Generated 🤗 Daily Papers Machine Learning Graph Representation Learning 🏢 Tufts University
G2PT: a novel graph generative model using sequence-based representation and transformer decoder, achieving superior performance on diverse tasks.
Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning
·2945 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 MIPT
Simple tweak, big privacy win: MLP-based architectures boost data protection in federated learning.
A New Federated Learning Framework Against Gradient Inversion Attacks
·2925 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 School of Computing and Data Science, University of Hong Kong
HyperFL: A new federated learning framework breaking the direct connection between shared parameters and private data, effectively defending against gradient inversion attacks while maintaining favorable performance.
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations
·5378 words·26 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Department of Artificial Intelligence, Sungkyunkwan University
Physics-Informed Gaussians (PIGs) revolutionize PDE solving by using adaptive, learnable Gaussian functions for superior accuracy and efficiency.
Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
·3440 words·17 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Google Research
Hybrid Graph Sequence Model (GSM++) outperforms existing models by using hierarchical sequences and a hybrid architecture of Transformers and recurrent models, effectively capturing both local and global dependencies.