
🏢 CLAIRE, EPFL

No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO

26 September 2024 · 5380 words · 26 mins

Machine Learning · Reinforcement Learning · 🏢 CLAIRE, EPFL

Deep RL agents trained under non-stationarity suffer performance collapse due to representation degradation; this work reveals this in PPO and introduces Proximal Feature Optimization (PFO) to mitigat…

Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers

26 September 2024 · 2873 words · 14 mins

Natural Language Processing · Large Language Models · 🏢 CLAIRE, EPFL

Training large language models efficiently is key; this paper shows how using structured feedforward layers and a novel training regime significantly reduces computational costs and improves training …