🏢 Institute of Mathematics

Deep linear networks for regression are implicitly regularized towards flat minima
2602 words · 13 mins
AI Generated · AI Theory · Optimization · 🏢 Institute of Mathematics
Deep linear networks implicitly regularize towards flat minima, with sharpness (Hessian’s largest eigenvalue) of minimizers linearly increasing with depth but bounded by a constant times the lower bou…