🏢 CISPA

Universality of AdaGrad Stepsizes for Stochastic Optimization: Inexact Oracle, Acceleration and Variance Reduction
·1717 words·9 mins
AI Theory Optimization 🏢 CISPA
Adaptive gradient methods using AdaGrad stepsizes achieve optimal convergence rates for convex composite optimization problems, handling inexact oracles and supporting acceleration and variance reduction, without needing prior knowledge of problem parameters.
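
To make the summary concrete, here is a minimal sketch of the scalar AdaGrad-Norm stepsize that this line of work builds on: the stepsize adapts to the accumulated squared gradient norms, so no smoothness or noise constants need to be known in advance. The function name, `gamma`, and `eps` are illustrative placeholders; the paper's actual methods also cover composite (proximal) steps, acceleration, and variance reduction.

```python
import numpy as np

def adagrad_norm_sgd(grad_fn, x0, n_steps=1000, gamma=1.0, eps=1e-8):
    """SGD with the scalar AdaGrad-Norm stepsize
    eta_t = gamma / sqrt(eps + sum_{s<=t} ||g_s||^2),
    which adapts to unknown smoothness and noise levels."""
    x = np.asarray(x0, dtype=float)
    grad_sq_sum = 0.0
    for _ in range(n_steps):
        g = grad_fn(x)                      # (possibly stochastic) gradient
        grad_sq_sum += float(np.dot(g, g))  # accumulate squared gradient norms
        eta = gamma / np.sqrt(eps + grad_sq_sum)
        x = x - eta * g
    return x

# Example: minimize f(x) = 0.5 * ||x||^2 from noisy gradients.
rng = np.random.default_rng(0)
x_final = adagrad_norm_sgd(lambda x: x + 0.01 * rng.standard_normal(x.shape),
                           np.ones(5))
```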
Dynamic Rescaling for Training GNNs
·1913 words·9 mins
Machine Learning Deep Learning 🏢 CISPA
Dynamic rescaling boosts GNN training by controlling per-layer learning speeds and keeping the network balanced, leading to faster training and improved generalization, especially on heterophilic graphs.
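
A minimal sketch of the rescale invariance that dynamic rescaling exploits: for positively homogeneous activations such as ReLU, scaling one layer up and the next layer down by the same factor leaves the network function unchanged, so the factor can be chosen to rebalance layer magnitudes (and hence learning speeds) for free. The helper name and the Frobenius-norm balancing criterion below are illustrative assumptions; the paper's method applies this idea dynamically during GNN training.

```python
import numpy as np

def balance_two_relu_layers(W1, W2):
    """For f(x) = W2 @ relu(W1 @ x), replacing (W1, W2) by (a*W1, W2/a)
    leaves f unchanged for any a > 0. Choosing a to equalize the layers'
    Frobenius norms 'balances' the pair without altering the function."""
    a = np.sqrt(np.linalg.norm(W2) / np.linalg.norm(W1))
    return a * W1, W2 / a

# The rescaled network computes exactly the same function.
rng = np.random.default_rng(0)
W1, W2 = rng.standard_normal((16, 8)), rng.standard_normal((4, 16))
x = rng.standard_normal(8)
f = lambda A, B: B @ np.maximum(A @ x, 0.0)
B1, B2 = balance_two_relu_layers(W1, W2)
assert np.allclose(f(W1, W2), f(B1, B2))
```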