🏢 Politecnico Di Milano
Sub-optimal Experts mitigate Ambiguity in Inverse Reinforcement Learning
·2049 words·10 mins·
AI Generated
Machine Learning
Reinforcement Learning
🏢 Politecnico Di Milano
Sub-optimal expert data improves Inverse Reinforcement Learning by significantly reducing ambiguity in reward function estimation.
Optimal Multi-Fidelity Best-Arm Identification
·2446 words·12 mins·
Machine Learning
Reinforcement Learning
🏢 Politecnico Di Milano
A new algorithm for multi-fidelity best-arm identification achieves asymptotically optimal cost complexity, offering significant improvements over existing methods.
Online Bayesian Persuasion Without a Clue
·1780 words·9 mins·
AI Theory
Optimization
🏢 Politecnico Di Milano
Researchers developed a novel online Bayesian persuasion algorithm that achieves sublinear regret without prior knowledge of the receiver or the state distribution, providing tight theoretical guarant…
Local Linearity: the Key for No-regret Reinforcement Learning in Continuous MDPs
·451 words·3 mins·
Machine Learning
Reinforcement Learning
🏢 Politecnico Di Milano
CINDERELLA, a new algorithm, achieves state-of-the-art no-regret bounds for continuous RL problems by exploiting local linearity.
Last-Iterate Global Convergence of Policy Gradients for Constrained Reinforcement Learning
·2971 words·14 mins·
Machine Learning
Reinforcement Learning
🏢 Politecnico Di Milano
New CRL algorithms guarantee global convergence, handle multiple constraints and various risk measures, improving safety and robustness in AI.
How does Inverse RL Scale to Large State Spaces? A Provably Efficient Approach
·1501 words·8 mins·
Machine Learning
Reinforcement Learning
🏢 Politecnico Di Milano
CATY-IRL, a novel, provably efficient algorithm, solves Inverse Reinforcement Learning’s scalability issues for large state spaces, improving upon state-of-the-art methods.
Bandits with Ranking Feedback
·1499 words·8 mins·
Machine Learning
Reinforcement Learning
🏢 Politecnico Di Milano
This paper introduces ‘bandits with ranking feedback,’ a novel bandit variation providing ranked feedback instead of numerical rewards. It proves instance-dependent cases require superlogarithmic reg…