Reinforcement Learning
Hierarchical Programmatic Option Framework
·5774 words·28 mins·
AI Generated
Machine Learning
Reinforcement Learning
National Taiwan University
Hierarchical Programmatic Option framework (HIPO) uses human-readable programs as options in reinforcement learning to solve long, repetitive tasks with improved interpretability and generalization.
GUIDE: Real-Time Human-Shaped Agents
·2015 words·10 mins·
Machine Learning
Reinforcement Learning
Duke University
GUIDE: Real-time human-shaped AI agents achieve up to 30% higher success rates using continuous human feedback, boosted by a parallel training model that mimics human input for continued improvement.
GTA: Generative Trajectory Augmentation with Guidance for Offline Reinforcement Learning
·3982 words·19 mins·
Machine Learning
Reinforcement Learning
KAIST
Generative Trajectory Augmentation (GTA) significantly boosts offline reinforcement learning by generating high-reward trajectories using a conditional diffusion model, enhancing algorithm performance…
Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
·2428 words·12 mins·
Machine Learning
Reinforcement Learning
National Key Laboratory of Human-Machine Hybrid Augmented Intelligence
Generative world models enhance multi-agent decision-making by simulating trial-and-error learning, improving answer accuracy and explainability.
Graph Diffusion Policy Optimization
·2821 words·14 mins·
AI Generated
Machine Learning
Reinforcement Learning
Zhejiang University
GDPO: A novel method optimizes graph diffusion models for any objective using reinforcement learning, achieving state-of-the-art performance in diverse graph generation tasks.
Going Beyond Heuristics by Imposing Policy Improvement as a Constraint
·3576 words·17 mins·
Machine Learning
Reinforcement Learning
National Taiwan University
HEPO, a novel constrained optimization method, consistently surpasses heuristic-trained policies in reinforcement learning by ensuring policy improvement over heuristics, regardless of heuristic quali…
Goal-Conditioned On-Policy Reinforcement Learning
·2363 words·12 mins·
Machine Learning
Reinforcement Learning
National University of Defense Technology
GCPO: a novel on-policy goal-conditioned reinforcement learning framework tackles limitations of existing HER-based methods by effectively addressing multi-goal Markovian and non-Markovian reward prob…
Goal Reduction with Loop-Removal Accelerates RL and Models Human Brain Activity in Goal-Directed Learning
·1872 words·9 mins·
Reinforcement Learning
Indiana University Bloomington
Goal Reduction with Loop-Removal accelerates Reinforcement Learning (RL) and accurately models human brain activity during goal-directed learning by efficiently deriving subgoals from distant original…
Global Rewards in Restless Multi-Armed Bandits
·2031 words·10 mins·
Machine Learning
Reinforcement Learning
Carnegie Mellon University
Restless multi-armed bandits with global rewards (RMAB-G) are introduced, extending the model to handle non-separable rewards and offering novel index-based and adaptive policies that outperform exist…
Generating Code World Models with Large Language Models Guided by Monte Carlo Tree Search
·4163 words·20 mins·
AI Generated
Machine Learning
Reinforcement Learning
Aalto University
LLMs guided by Monte Carlo Tree Search generate precise, efficient Python code as world models for model-based reinforcement learning, significantly improving sample efficiency and inference speed.
Generalizing Consistency Policy to Visual RL with Prioritized Proximal Experience Regularization
·3205 words·16 mins·
AI Generated
Machine Learning
Reinforcement Learning
Institute of Automation, Chinese Academy of Sciences
CP3ER, a novel consistency policy with prioritized proximal experience regularization, significantly boosts sample efficiency and stability in visual reinforcement learning, achieving state-of-the-art…
Generalized Linear Bandits with Limited Adaptivity
·341 words·2 mins·
Reinforcement Learning
Stanford University
This paper introduces two novel algorithms, achieving optimal regret in generalized linear contextual bandits despite limited policy updates, a significant advancement for real-world applications.
Gaussian Process Bandits for Top-k Recommendations
·1799 words·9 mins·
Machine Learning
Reinforcement Learning
University of Massachusetts Amherst
GP-TopK: A novel contextual bandit algorithm uses Gaussian processes with a Kendall kernel for efficient & accurate top-k recommendations, even with limited feedback.
Gaussian Approximation and Multiplier Bootstrap for Polyak-Ruppert Averaged Linear Stochastic Approximation with Applications to TD Learning
·1382 words·7 mins·
Machine Learning
Reinforcement Learning
HSE University
This paper delivers non-asymptotic accuracy bounds for confidence intervals in linear stochastic approximation, leveraging a novel multiplier bootstrap method.
Functional Bilevel Optimization for Machine Learning
·1884 words·9 mins·
Reinforcement Learning
University of Grenoble Alpes
Functional Bilevel Optimization tackles the ambiguity of using neural networks in bilevel optimization by minimizing the inner objective over a function space, leading to scalable & efficient algorith…
From Text to Trajectory: Exploring Complex Constraint Representation and Decomposition in Safe Reinforcement Learning
·3972 words·19 mins·
AI Generated
Machine Learning
Reinforcement Learning
Beihang University
TTCT translates natural language constraints into effective training signals for safe reinforcement learning, enabling agents to learn safer policies with lower violation rates and zero-shot transfer …
Foundations of Multivariate Distributional Reinforcement Learning
·1558 words·8 mins·
Machine Learning
Reinforcement Learning
Google DeepMind
First oracle-free, computationally tractable algorithms for provably convergent multivariate distributional RL are introduced, achieving convergence rates matching scalar settings and offering insight…
Focus On What Matters: Separated Models For Visual-Based RL Generalization
·2557 words·13 mins·
Machine Learning
Reinforcement Learning
Department of Computer Science, Tongji University
SMG (Separated Models for Generalization) enhances visual RL generalization by disentangling task-relevant and irrelevant visual features via cooperative reconstruction, achieving state-of-the-art per…
FlexPlanner: Flexible 3D Floorplanning via Deep Reinforcement Learning in Hybrid Action Space with Multi-Modality Representation
·3516 words·17 mins·
AI Generated
Machine Learning
Reinforcement Learning
Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
FlexPlanner: Deep reinforcement learning solves flexible 3D floorplanning, improving wirelength and alignment significantly.
Fixed Confidence Best Arm Identification in the Bayesian Setting
·1424 words·7 mins·
AI Generated
Machine Learning
Reinforcement Learning
Università Degli Studi Di Milano
Bayesian best-arm identification algorithm achieves near-optimal sample complexity by incorporating an early-stopping criterion.