
Reinforcement Learning

Hierarchical Programmatic Option Framework
·5774 words·28 mins
AI Generated Machine Learning Reinforcement Learning 🏒 National Taiwan University
Hierarchical Programmatic Option framework (HIPO) uses human-readable programs as options in reinforcement learning to solve long, repetitive tasks with improved interpretability and generalization.
GUIDE: Real-Time Human-Shaped Agents
·2015 words·10 mins
Machine Learning Reinforcement Learning 🏒 Duke University
GUIDE: Real-time human-shaped AI agents achieve up to 30% higher success rates using continuous human feedback, boosted by a parallel training model that mimics human input for continued improvement.
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
·3982 words·19 mins
Machine Learning Reinforcement Learning 🏒 KAIST
Generative Trajectory Augmentation (GTA) significantly boosts offline reinforcement learning by generating high-reward trajectories using a conditional diffusion model, enhancing algorithm performance…
Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
·2428 words·12 mins
Machine Learning Reinforcement Learning 🏒 National Key Laboratory of Human-Machine Hybrid Augmented Intelligence
Generative world models enhance multi-agent decision-making by simulating trial-and-error learning, improving answer accuracy and explainability.
Graph Diffusion Policy Optimization
·2821 words·14 mins
AI Generated Machine Learning Reinforcement Learning 🏒 Zhejiang University
GDPO: A novel method optimizes graph diffusion models for any objective using reinforcement learning, achieving state-of-the-art performance in diverse graph generation tasks.
Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
·3576 words·17 mins
Machine Learning Reinforcement Learning 🏒 National Taiwan University
HEPO, a novel constrained optimization method, consistently surpasses heuristic-trained policies in reinforcement learning by ensuring policy improvement over heuristics, regardless of heuristic quali…
Goal-Conditioned On-Policy Reinforcement Learning
·2363 words·12 mins
Machine Learning Reinforcement Learning 🏒 National University of Defense Technology
GCPO: a novel on-policy goal-conditioned reinforcement learning framework tackles limitations of existing HER-based methods by effectively addressing multi-goal Markovian and non-Markovian reward prob…
Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning
·1872 words·9 mins
Reinforcement Learning 🏒 Indiana University Bloomington
Goal Reduction with Loop-Removal accelerates Reinforcement Learning (RL) and accurately models human brain activity during goal-directed learning by efficiently deriving subgoals from distant original…
Global Rewards in Restless Multi-Armed Bandits
·2031 words·10 mins
Machine Learning Reinforcement Learning 🏒 Carnegie Mellon University
Restless multi-armed bandits with global rewards (RMAB-G) are introduced, extending the model to handle non-separable rewards and offering novel index-based and adaptive policies that outperform exist…
Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search
·4163 words·20 mins
AI Generated Machine Learning Reinforcement Learning 🏒 Aalto University
LLMs guided by Monte Carlo Tree Search generate precise, efficient Python code as world models for model-based reinforcement learning, significantly improving sample efficiency and inference speed.
Generalizing Consistency Policy to Visual RL with Prioritized Proximal Experience Regularization
·3205 words·16 mins
AI Generated Machine Learning Reinforcement Learning 🏒 Institute of Automation, Chinese Academy of Sciences
CP3ER, a novel consistency policy with prioritized proximal experience regularization, significantly boosts sample efficiency and stability in visual reinforcement learning, achieving state-of-the-art…
Generalized Linear Bandits with Limited Adaptivity
·341 words·2 mins
Reinforcement Learning 🏒 Stanford University
This paper introduces two novel algorithms that achieve optimal regret in generalized linear contextual bandits despite limited policy updates, a significant advancement for real-world applications.
Gaussian Process Bandits for Top-k Recommendations
·1799 words·9 mins
Machine Learning Reinforcement Learning 🏒 University of Massachusetts Amherst
GP-TopK: A novel contextual bandit algorithm uses Gaussian processes with a Kendall kernel for efficient and accurate top-k recommendations, even with limited feedback.
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
·1382 words·7 mins
Machine Learning Reinforcement Learning 🏒 HSE University
This paper delivers non-asymptotic accuracy bounds for confidence intervals in linear stochastic approximation, leveraging a novel multiplier bootstrap method.
Functional Bilevel Optimization for Machine Learning
·1884 words·9 mins
Reinforcement Learning 🏒 University of Grenoble Alpes
Functional Bilevel Optimization tackles the ambiguity of using neural networks in bilevel optimization by minimizing the inner objective over a function space, leading to scalable & efficient algorith…
From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning
·3972 words·19 mins
AI Generated Machine Learning Reinforcement Learning 🏒 Beihang University
TTCT translates natural language constraints into effective training signals for safe reinforcement learning, enabling agents to learn safer policies with lower violation rates and zero-shot transfer …
Foundations of Multivariate Distributional Reinforcement Learning
·1558 words·8 mins
Machine Learning Reinforcement Learning 🏒 Google DeepMind
First oracle-free, computationally tractable algorithms for provably convergent multivariate distributional RL are introduced, achieving convergence rates matching scalar settings and offering insight…
Focus On What Matters: Separated Models For Visual-Based RL Generalization
·2557 words·13 mins
Machine Learning Reinforcement Learning 🏒 Department of Computer Science, Tongji University
SMG (Separated Models for Generalization) enhances visual RL generalization by disentangling task-relevant and irrelevant visual features via cooperative reconstruction, achieving state-of-the-art per…
FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation
·3516 words·17 mins
AI Generated Machine Learning Reinforcement Learning 🏒 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
FlexPlanner: Deep reinforcement learning solves flexible 3D floorplanning, improving wirelength and alignment significantly.
Fixed Confidence Best Arm Identification in the Bayesian Setting
·1424 words·7 mins
AI Generated Machine Learning Reinforcement Learning 🏒 Università degli Studi di Milano
A Bayesian best-arm identification algorithm achieves near-optimal sample complexity by incorporating an early-stopping criterion.