🏢 Institute of Mathematics
Deep linear networks for regression are implicitly regularized towards flat minima
2602 words · 13 mins
AI Generated
AI Theory
Optimization
Deep linear networks implicitly regularize towards flat minima, with sharpness (Hessian’s largest eigenvalue) of minimizers linearly increasing with depth but bounded by a constant times the lower bou…