Reinforcement Learning

Zero-Shot Reinforcement Learning from Low Quality Data
·4722 words·23 mins
AI Generated Machine Learning Reinforcement Learning 🏢 University of Cambridge
Zero-shot RL struggles with low-quality data; this paper introduces conservative algorithms that significantly boost performance on such data without sacrificing performance on high-quality data.
Worst-Case Offline Reinforcement Learning with Arbitrary Data Support
·450 words·3 mins
AI Generated Machine Learning Reinforcement Learning 🏢 IBM Research
Worst-case offline RL guarantees near-optimal policy performance without data support assumptions, achieving a sample complexity bound of O(ε⁻²).
When Your AIs Deceive You: Challenges of Partial Observability in Reinforcement Learning from Human Feedback
·2699 words·13 mins
Machine Learning Reinforcement Learning 🏢 University of Amsterdam
RLHF’s reliance on fully observable environments is challenged: human feedback, often based on only partial observations, can lead to deceptive AI behavior (inflation and overjustification).
When to Sense and Control? A Time-adaptive Approach for Continuous-Time RL
·2003 words·10 mins
AI Generated Machine Learning Reinforcement Learning 🏢 ETH Zurich
TACOS: A novel time-adaptive RL framework drastically reduces interactions in continuous-time systems while improving performance, offering both model-free and model-based algorithms.
Warm-up Free Policy Optimization: Improved Regret in Linear Markov Decision Processes
·193 words·1 min
Machine Learning Reinforcement Learning 🏢 Tel Aviv University
Warm-up-free policy optimization achieves rate-optimal regret in linear Markov decision processes, improving efficiency and dependence on problem parameters.
Verified Safe Reinforcement Learning for Neural Network Dynamic Models
·1254 words·6 mins
Machine Learning Reinforcement Learning 🏢 Washington University in St. Louis
Learning verified safe neural network controllers for complex nonlinear systems is now possible, achieving an order of magnitude longer safety horizons than state-of-the-art methods while maintaining …
Verifiably Robust Conformal Prediction
·1918 words·10 mins
AI Generated Machine Learning Reinforcement Learning 🏢 King's College London
VRCP, a new framework, uses neural network verification to make conformal prediction robust against adversarial attacks, supporting various norms and regression tasks.
Variational Delayed Policy Optimization
·1922 words·10 mins
Reinforcement Learning 🏢 University of Southampton
VDPO: A novel framework for delayed reinforcement learning that improves sample efficiency by 50% without compromising performance.
Value-Based Deep Multi-Agent Reinforcement Learning with Dynamic Sparse Training
·4753 words·23 mins
AI Generated Machine Learning Reinforcement Learning 🏢 Tsinghua University
MAST: Train ultra-sparse deep MARL agents with minimal performance loss!
Unlock the Intermittent Control Ability of Model Free Reinforcement Learning
·2548 words·12 mins
Machine Learning Reinforcement Learning 🏢 Tianjin University
MARS, a novel plugin framework, unlocks model-free RL’s intermittent control ability by encoding action sequences into a compact latent space, improving learning efficiency and real-world robotic task…
Uniform Last-Iterate Guarantee for Bandits and Reinforcement Learning
·285 words·2 mins
Machine Learning Reinforcement Learning 🏢 University of Washington
This paper introduces the Uniform Last-Iterate (ULI) guarantee, a novel metric for evaluating reinforcement learning algorithms that considers both cumulative and instantaneous performance. Unlike ex…
Understanding Model Selection for Learning in Strategic Environments
·394 words·2 mins
Machine Learning Reinforcement Learning 🏢 California Institute of Technology
Larger machine learning models do not always perform better: this research shows that strategic interactions can reverse the trend, prompting a new paradigm for model selection in games.
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions
·2152 words·11 mins
Machine Learning Reinforcement Learning 🏢 University of Science and Technology of China
TRACER, a novel robust offline RL algorithm, uses Bayesian inference to handle uncertainty from diverse data corruptions, significantly outperforming existing methods.
Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning
·1675 words·8 mins
Machine Learning Reinforcement Learning 🏢 Shanghai University of Finance and Economics
Two-way Deconfounder tackles off-policy evaluation challenges by introducing a novel two-way unmeasured confounding assumption and a neural-network-based deconfounder, achieving consistent policy valu…
Truncated Variance Reduced Value Iteration
·1418 words·7 mins
Machine Learning Reinforcement Learning 🏢 Stanford University
Faster algorithms for solving discounted Markov Decision Processes (DMDPs) are introduced, achieving near-optimal sample and time complexities, especially in the sample setting and improving runtimes …
Transition Constrained Bayesian Optimization via Markov Decision Processes
·2420 words·12 mins
Machine Learning Reinforcement Learning 🏢 Imperial College London
This paper presents a novel BayesOpt framework that incorporates Markov Decision Processes to optimize black-box functions with transition constraints, overcoming limitations of traditional methods.
Transformers as Game Players: Provable In-context Game-playing Capabilities of Pre-trained Models
·502 words·3 mins
AI Generated Machine Learning Reinforcement Learning 🏢 University of Virginia
Pre-trained transformers can provably learn to play games near-optimally using in-context learning, offering theoretical guarantees for both decentralized and centralized settings.
Trajectory Data Suffices for Statistically Efficient Learning in Offline RL with Linear q^π-Realizability and Concentrability
·479 words·3 mins
AI Generated Machine Learning Reinforcement Learning 🏢 University of Alberta
Offline RL with trajectory data achieves statistically efficient learning under linear q^π-realizability and concentrability, solving a problem previously deemed impossible.
Towards the Transferability of Rewards Recovered via Regularized Inverse Reinforcement Learning
·2032 words·10 mins
AI Generated Machine Learning Reinforcement Learning 🏢 SYCAMORE, EPFL
This paper proposes a novel solution to the transferability problem in inverse reinforcement learning (IRL) using principal angles to measure the similarity between transition laws. It provides suffi…
Towards Efficient and Optimal Covariance-Adaptive Algorithms for Combinatorial Semi-Bandits
·1492 words·8 mins
Machine Learning Reinforcement Learning 🏢 Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LJK
Novel covariance-adaptive algorithms achieve optimal gap-free regret bounds for combinatorial semi-bandits, improving efficiency with sampling-based approaches.