Oral Reinforcement Learnings

The Sample-Communication Complexity Trade-off in Federated Q-Learning

26 September 2024·1654 words·8 mins

Reinforcement Learning 🏢 Carnegie Mellon University

Fed-DVR-Q, a novel federated Q-learning algorithm, simultaneously achieves optimal sample and communication complexity.

Statistical Efficiency of Distributional Temporal Difference Learning

26 September 2024·295 words·2 mins

Reinforcement Learning 🏢 Peking University

Researchers achieve minimax optimal sample complexity bounds for distributional temporal difference learning, enhancing reinforcement learning algorithm efficiency.

Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs

26 September 2024·1956 words·10 mins

Reinforcement Learning 🏢 University of Wisconsin-Madison

This paper achieves minimax-optimal bounds for learning near-optimal policies in average-reward MDPs, addressing a long-standing open problem in reinforcement learning.

Reinforcement Learning Under Latent Dynamics: Toward Statistical and Algorithmic Modularity

26 September 2024·330 words·2 mins

Reinforcement Learning 🏢 University of Michigan

This paper pioneers a modular framework for reinforcement learning, addressing the challenge of learning under complex observations and simpler latent dynamics, offering both statistical and algorithm…

Maximum Entropy Inverse Reinforcement Learning of Diffusion Models with Energy-Based Models

26 September 2024·2261 words·11 mins

Reinforcement Learning 🏢 Korea Institute for Advanced Study

A novel maximum entropy inverse reinforcement learning approach boosts diffusion model sample quality, especially with few steps, by jointly training the model and an energy-based mode…

Learning Formal Mathematics From Intrinsic Motivation

26 September 2024·1732 words·9 mins

Reinforcement Learning 🏢 Stanford University

AI agent MINIMO learns to generate challenging mathematical conjectures and prove them, bootstrapping from axioms alone and self-improving in both conjecture generation and theorem proving.

Improving Environment Novelty Quantification for Effective Unsupervised Environment Design

26 September 2024·2893 words·14 mins

Reinforcement Learning 🏢 Singapore Management University

The CENIE framework quantifies environment novelty via state-action coverage, enhancing unsupervised environment design for robust generalization.