Skip to main content

🏢 Université De Montréal

Improving Deep Reinforcement Learning by Reducing the Chain Effect of Value and Policy Churn
·3413 words·17 mins· loading · loading
Machine Learning Reinforcement Learning 🏢 Université De Montréal
Deep RL agents often suffer from instability due to the ‘chain effect’ of value and policy churn; this paper introduces CHAIN, a novel method to reduce this churn, thereby improving DRL performance an…