
Reinforcement Learning

MLGym: A New Framework and Benchmark for Advancing AI Research Agents
·1911 words·9 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 UC Santa Barbara
MLGYM: A new framework & benchmark to advance AI Research Agents
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning
·3688 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Microsoft Research Asia
Logic-RL unlocks LLM reasoning via rule-based reinforcement learning, generalizing to math problems after training on logic puzzles.
AdaptiveStep: Automatically Dividing Reasoning Step through Model Confidence
·4758 words·23 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Nanjing University
AdaptiveStep: Divides reasoning steps automatically through model confidence, enhancing PRM training & performance.
S$^2$R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning
·3894 words·19 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Tencent
S2R: Teaches LLMs to self-verify and self-correct, boosting reasoning with efficient reinforcement learning.
Memory, Benchmark & Robots: A Benchmark for Solving Complex Tasks with Reinforcement Learning
·4399 words·21 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
MIKASA, a new benchmark for memory-intensive reinforcement learning, provides a unified framework for evaluating memory capabilities in diverse scenarios, including complex robotic manipulation tasks.
Agency Is Frame-Dependent
·400 words·2 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Google DeepMind
Agency, a key concept in AI, is shown to be relative to the observer’s perspective (frame-dependent), challenging traditional binary definitions and necessitating a more nuanced approach for AI systems.
Improving Transformer World Models for Data-Efficient RL
·2775 words·14 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 Google DeepMind
AI agents now master complex tasks with improved Transformer World Models, achieving a new state-of-the-art in data-efficient reinforcement learning.
ACECODER: Acing Coder RL via Automated Test-Case Synthesis
·3269 words·16 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 University of Waterloo
AceCoder uses automated test-case synthesis to create a large-scale dataset for training reward models, enabling effective reinforcement learning to significantly boost code generation model performance.
SRMT: Shared Memory for Multi-agent Lifelong Pathfinding
·3632 words·18 mins
AI Generated 🤗 Daily Papers Machine Learning Reinforcement Learning 🏢 AIRI
SRMT: Shared Recurrent Memory Transformer boosts multi-agent coordination by implicitly sharing information via a global memory, significantly outperforming baselines in complex pathfinding tasks.