Machine Learning

S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
·3894 words·19 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Tencent
S2R: Teaches LLMs to self-verify and self-correct, boosting reasoning with efficient reinforcement learning.
NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation
·6586 words·31 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 National University of Singapore
NExT-Mol: Combines 1D language models with 3D diffusion for molecule generation, achieving state-of-the-art performance and validity.
Eager Updates For Overlapped Communication and Computation in DiLoCo
·3815 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 Google DeepMind
Eager updates drastically speed up training massive language models by cleverly overlapping communication and computation in DiLoCo, achieving near-optimal performance even with low bandwidth.
Thinking Preference Optimization
·5794 words·28 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Case Western Reserve University
ThinkPO improves LLM reasoning by preferring longer CoT, boosting performance without new data.
Small Models Struggle to Learn from Strong Reasoners
·4149 words·20 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 University of Washington
Small language models struggle to learn complex reasoning from large models, but a novel ‘Mix Distillation’ method balances complexity for effective capability transfer.
Towards Data-Efficient Pretraining for Atomic Property Prediction
·3694 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Transfer Learning 🏢 King Abdullah University of Science and Technology
High-quality, task-relevant pretraining data surpasses large-scale pretraining in atomic property prediction, achieving comparable performance at 1/24th the computational cost.
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
·4399 words·21 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
MIKASA, a new benchmark for memory-intensive reinforcement learning, provides a unified framework for evaluating memory capabilities in diverse scenarios, including complex robotic manipulation tasks.
AdaPTS: Adapting Univariate Foundation Models to Probabilistic Multivariate Time Series Forecasting
·3650 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Huawei Noah's Ark Lab, Paris, France
AdaPTS effectively adapts pre-trained univariate time series models to probabilistic multivariate forecasting, improving accuracy and uncertainty quantification.
Can this Model Also Recognize Dogs? Zero-Shot Model Search from Weights
·3096 words·15 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 School of Computer Science and Engineering
ProbeLog: Zero-shot model search directly from weights, boosting efficiency and accuracy!
Agency Is Frame-Dependent
·400 words·2 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Google DeepMind
Agency, a key concept in AI, is shown to be relative to the observer’s perspective (frame-dependent), challenging traditional binary definitions and necessitating a more nuanced approach for AI systems.
Improving Transformer World Models for Data-Efficient RL
·2775 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Google DeepMind
AI agents now master complex tasks with improved Transformer World Models, achieving a new state-of-the-art in data-efficient reinforcement learning.
ACECODER: Acing Coder RL via Automated Test-Case Synthesis
·3269 words·16 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Waterloo
AceCoder uses automated test-case synthesis to create a large-scale dataset for training reward models, enabling effective reinforcement learning to significantly boost code generation model performance.
Weak-to-Strong Diffusion with Reflection
·4655 words·22 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Hong Kong University of Science and Technology
W2SD: A novel framework boosts diffusion model quality by using the difference between weak and strong models to refine sampling trajectories, achieving state-of-the-art performance.
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch
·5509 words·26 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 Google DeepMind
Streaming DiLoCo achieves a two-orders-of-magnitude bandwidth reduction in billion-parameter LLM training by synchronizing parameter subsets sequentially, overlapping communication with computation.
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
·3632 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
SRMT: Shared Recurrent Memory Transformer boosts multi-agent coordination by implicitly sharing information via a global memory, significantly outperforming baselines in complex pathfinding tasks.
Graph Generative Pre-trained Transformer
·3057 words·15 mins
AI Generated 🤗 Daily Papers Machine Learning Graph Representation Learning 🏢 Tufts University
G2PT: a novel graph generative model using sequence-based representation and transformer decoder, achieving superior performance on diverse tasks.
Just a Simple Transformation is Enough for Data Protection in Vertical Federated Learning
·2945 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 MIPT
Simple tweak, big privacy win: MLP-based architectures boost data protection in federated learning.
A New Federated Learning Framework Against Gradient Inversion Attacks
·2925 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Federated Learning 🏢 School of Computing and Data Science, University of Hong Kong
HyperFL: A new federated learning framework breaking the direct connection between shared parameters and private data, effectively defending against gradient inversion attacks while maintaining favorable performance.
PIG: Physics-Informed Gaussians as Adaptive Parametric Mesh Representations
·5378 words·26 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Department of Artificial Intelligence, Sungkyunkwan University
Physics-Informed Gaussians (PIGs) revolutionize PDE solving by using adaptive, learnable Gaussian functions for superior accuracy and efficiency.
Best of Both Worlds: Advantages of Hybrid Graph Sequence Models
·3440 words·17 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Google Research
Hybrid Graph Sequence Model (GSM++) outperforms existing models by using hierarchical sequences and a hybrid architecture of Transformers and recurrent models, effectively capturing both local and global dependencies.