Reinforcement Learning
Streaming Bayes GFlowNets
·2088 words·10 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Getulio Vargas Foundation
SB-GFlowNets: Streaming Bayesian inference is now efficient and accurate using GFlowNets, enabling real-time model updates for large, sequential datasets.
Strategic Multi-Armed Bandit Problems Under Debt-Free Reporting
·1460 words·7 mins·
Machine Learning
Reinforcement Learning
🏢 CREST, ENSAE
Incentive-aware algorithm achieves low regret in strategic multi-armed bandits under debt-free reporting, establishing truthful equilibrium among arms.
Stochastic contextual bandits with graph feedback: from independence number to MAS number
·289 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 New York University
Contextual bandits with graph feedback achieve near-optimal regret by leveraging a novel graph-theoretic quantity that interpolates between independence and maximum acyclic subgraph numbers, depending…
Statistical Efficiency of Distributional Temporal Difference Learning
·295 words·2 mins·
Reinforcement Learning
🏢 Peking University
Researchers achieve minimax optimal sample complexity bounds for distributional temporal difference learning, enhancing reinforcement learning algorithm efficiency.
State-free Reinforcement Learning
·357 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 Boston University
State-free Reinforcement Learning (SFRL) framework eliminates the need for state-space information in RL algorithms, achieving regret bounds independent of the state space size and adaptive to the rea…
State Chrono Representation for Enhancing Generalization in Reinforcement Learning
·2535 words·12 mins·
Machine Learning
Reinforcement Learning
🏢 University of California, Santa Barbara
State Chrono Representation (SCR) enhances reinforcement learning generalization by incorporating extensive temporal information and cumulative rewards into state representations, improving performanc…
SPRINQL: Sub-optimal Demonstrations driven Offline Imitation Learning
·3010 words·15 mins·
Machine Learning
Reinforcement Learning
🏢 Singapore Management University
SPRINQL: Sub-optimal Demonstrations for Offline Imitation Learning
SPO: Sequential Monte Carlo Policy Optimisation
·3026 words·15 mins·
Machine Learning
Reinforcement Learning
🏢 University of Amsterdam
SPO: A novel model-based RL algorithm leverages parallelisable Sequential Monte Carlo search for efficient and robust policy improvement in both discrete and continuous environments.
Speculative Monte-Carlo Tree Search
·1943 words·10 mins·
Machine Learning
Reinforcement Learning
🏢 Pennsylvania State University
Speculative MCTS accelerates AlphaZero training by implementing speculative execution, enabling parallel processing of future moves and reducing latency by up to 5.8x.
Spectral-Risk Safe Reinforcement Learning with Convergence Guarantees
·2502 words·12 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Seoul National University
SRCPO: a novel spectral risk measure-constrained RL algorithm guaranteeing convergence to a global optimum, outperforming existing methods in continuous control tasks.
Sparsity-Agnostic Linear Bandits with Adaptive Adversaries
·336 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 National University of Singapore
SparseLinUCB: First sparse regret bounds for adversarial action sets with unknown sparsity, achieving superior performance over existing methods!
Span-Based Optimal Sample Complexity for Weakly Communicating and General Average Reward MDPs
·1956 words·10 mins·
Reinforcement Learning
🏢 University of Wisconsin-Madison
This paper achieves minimax-optimal bounds for learning near-optimal policies in average-reward MDPs, addressing a long-standing open problem in reinforcement learning.
Solving Zero-Sum Markov Games with Continuous State via Spectral Dynamic Embedding
·391 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 Zhejiang University
SDEPO, a new natural policy gradient algorithm, efficiently solves zero-sum Markov games with continuous state spaces, achieving near-optimal convergence independent of state space cardinality.
Solving Minimum-Cost Reach Avoid using Reinforcement Learning
·2253 words·11 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 MIT
RC-PPO: Reinforcement learning solves minimum-cost reach-avoid problems with up to 57% lower costs!
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
·447 words·3 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
Stochastic gradient bandit algorithms are now guaranteed to converge globally with ANY constant learning rate!
SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents
·2849 words·14 mins·
Machine Learning
Reinforcement Learning
🏢 Khoury College of Computer Sciences, Northeastern University
SleeperNets: A universal backdoor attack against RL agents, achieving 100% success rate across diverse environments while preserving benign performance.
Skill-aware Mutual Information Optimisation for Zero-shot Generalisation in Reinforcement Learning
·5509 words·26 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Edinburgh
Skill-aware Mutual Information optimization enhances RL agent generalization across diverse tasks by distinguishing context embeddings based on skills, leading to improved zero-shot performance and ro…
SkiLD: Unsupervised Skill Discovery Guided by Factor Interactions
·2028 words·10 mins·
Machine Learning
Reinforcement Learning
🏢 University of Texas at Austin
SkiLD, a novel unsupervised skill discovery method, uses state factorization and a new objective function to learn skills inducing diverse interactions between state factors, outperforming existing me…
Simplifying Latent Dynamics with Softly State-Invariant World Models
·2423 words·12 mins·
Machine Learning
Reinforcement Learning
🏢 Max Planck Institute for Biological Cybernetics
This paper introduces the Parsimonious Latent Space Model (PLSM), a novel world model that regularizes latent dynamics to improve action predictability, enhancing RL performance.
Simplifying Constraint Inference with Inverse Reinforcement Learning
·1653 words·8 mins·
Machine Learning
Reinforcement Learning
🏢 University of Toronto
This paper simplifies constraint inference in reinforcement learning, demonstrating that standard inverse RL methods can effectively infer constraints from expert data, surpassing complex, previously …