Reinforcement Learning
Local and Adaptive Mirror Descents in Extensive-Form Games
·321 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 CREST - FairPlay, ENSAE Paris
LocalOMD: Adaptive OMD in extensive-form games achieves near-optimal sample complexity by using fixed sampling and local updates, reducing variance and generalizing well.
Leveraging Separated World Model for Exploration in Visually Distracted Environments
·2320 words·11 mins·
Machine Learning
Reinforcement Learning
🏢 School of Artificial Intelligence, Nanjing University, China
SeeX, a novel bi-level optimization framework, effectively tackles the challenge of exploration in visually cluttered environments by training a separated world model to extract relevant information a…
Learning World Models for Unconstrained Goal Navigation
·2782 words·14 mins·
Machine Learning
Reinforcement Learning
🏢 Rutgers University
MUN: a novel goal-directed exploration algorithm significantly improves world model reliability and policy generalization in sparse-reward goal-conditioned RL, enabling efficient navigation across div…
Learning Versatile Skills with Curriculum Masking
·2688 words·13 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Shanghai Jiao Tong University
CurrMask: a novel curriculum masking paradigm for offline RL, achieving superior zero-shot and fine-tuning performance by dynamically adjusting masking schemes during pretraining, enabling versatile s…
Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games
·2604 words·13 mins·
Machine Learning
Reinforcement Learning
🏢 Peking University
AI agents learn to balance helpfulness and self-preservation using empathy to gauge social relationships and guide reward sharing.
Learning the Optimal Policy for Balancing Short-Term and Long-Term Rewards
·1775 words·9 mins·
Machine Learning
Reinforcement Learning
🏢 ByteDance Research
A novel Decomposition-based Policy Learning (DPPL) method optimally balances short-term and long-term rewards, even with interrelated objectives, by transforming the problem into intuitive subproblems…
Learning Successor Features the Simple Way
·9069 words·43 mins·
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
Learn deep Successor Features (SFs) directly from pixels, efficiently and without representation collapse, using a novel, simple method combining TD and reward prediction loss!
Learning Multimodal Behaviors from Scratch with Diffusion Policy Gradient
·3397 words·16 mins·
Machine Learning
Reinforcement Learning
🏢 MIT
DDiffPG: A novel actor-critic algorithm learns multimodal policies from scratch using diffusion models, enabling agents to master versatile behaviors in complex tasks.
Learning in Markov Games with Adaptive Adversaries: Policy Regret, Fundamental Barriers, and Efficient Algorithms
·349 words·2 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Johns Hopkins University
Learning against adaptive adversaries in Markov games is hard, but this paper shows how to achieve low policy regret with efficient algorithms by introducing a new notion of consistent adaptive advers…
Learning General Parameterized Policies for Infinite Horizon Average Reward Constrained MDPs via Primal-Dual Policy Gradient Algorithm
·327 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 Purdue University
First sublinear regret and constraint-violation bounds for infinite-horizon average-reward CMDPs with general policy parametrization, achieved with a novel primal-dual policy gradient algorithm.
Learning Formal Mathematics From Intrinsic Motivation
·1732 words·9 mins·
Reinforcement Learning
🏢 Stanford University
AI agent MINIMO learns to generate challenging mathematical conjectures and prove them, bootstrapping from axioms alone and self-improving in both conjecture generation and theorem proving.
Learning Equilibria in Adversarial Team Markov Games: A Nonconvex-Hidden-Concave Min-Max Optimization Problem
·338 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 UC Irvine
AI agents efficiently learn Nash equilibria in adversarial team Markov games using a novel learning algorithm with polynomial complexity, resolving prior limitations.
Learning Distinguishable Trajectory Representation with Contrastive Loss
·2347 words·12 mins·
Machine Learning
Reinforcement Learning
🏢 Nanjing University of Aeronautics and Astronautics
Contrastive Trajectory Representation (CTR) boosts multi-agent reinforcement learning by learning distinguishable agent trajectories using contrastive loss, thus improving performance significantly.
Latent Plan Transformer for Trajectory Abstraction: Planning as Latent Space Inference
·2609 words·13 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 UC Los Angeles
Latent Plan Transformer (LPT) solves long-term planning challenges in reinforcement learning by using latent variables to connect trajectory generation with final returns, achieving competitive result…
Latent Learning Progress Drives Autonomous Goal Selection in Human Reinforcement Learning
·2787 words·14 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 UC Berkeley
Humans autonomously select goals based on both observed and latent learning progress, impacting goal-conditioned policy learning.
Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
·2971 words·14 mins·
Machine Learning
Reinforcement Learning
🏢 Politecnico di Milano
New CRL algorithms guarantee global convergence, handle multiple constraints and various risk measures, improving safety and robustness in AI.
Kernel-Based Function Approximation for Average Reward Reinforcement Learning: An Optimistic No-Regret Algorithm
·311 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 MediaTek Research
Novel optimistic RL algorithm using kernel methods achieves no-regret performance in the challenging infinite-horizon average-reward setting.
KALM: Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts
·3188 words·15 mins·
Machine Learning
Reinforcement Learning
🏢 National Key Laboratory for Novel Software Technology, Nanjing University, China
KALM: Knowledgeable agents learn complex tasks from LLMs via offline RL using imaginary rollouts, significantly outperforming baselines.
Kaleidoscope: Learnable Masks for Heterogeneous Multi-agent Reinforcement Learning
·2281 words·11 mins·
Machine Learning
Reinforcement Learning
🏢 Hong Kong University of Science and Technology
Kaleidoscope uses learnable masks for adaptive partial parameter sharing in heterogeneous MARL, achieving high sample efficiency and policy diversity.
Iteratively Refined Behavior Regularization for Offline Reinforcement Learning
·2346 words·12 mins·
Machine Learning
Reinforcement Learning
🏢 Shanxi University
Iteratively Refined Behavior Regularization improves offline reinforcement learning by repeatedly refining the reference policy, yielding robust and effective control policies.