🏢 Institute of Science and Technology Austria (ISTA)

MicroAdam: Accurate Adaptive Optimization with Low Space Overhead and Provable Convergence
Natural Language Processing · Large Language Models
MicroAdam: a new variant of the Adam optimizer that substantially reduces memory overhead when training large language models, while preserving accuracy and provable convergence guarantees.