Optimization
On Sparse Canonical Correlation Analysis
·2005 words·10 mins·
AI Generated
AI Theory
Optimization
🏢 University of Tennessee
This paper presents novel, efficient algorithms and formulations for Sparse Canonical Correlation Analysis (SCCA), a method that improves the interpretability of traditional CCA. SCCA is especially us…
On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability
·2131 words·11 mins·
AI Generated
AI Theory
Optimization
🏢 Gaoling School of Artificial Intelligence, Renmin University of China
Autoregressively trained transformers surprisingly learn algorithms during pretraining, enabling in-context learning; this paper reveals when and why this ‘mesa-optimization’ happens.
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
·308 words·2 mins·
AI Theory
Optimization
🏢 Zhejiang University
Adam optimizer achieves near-optimal convergence in non-convex scenarios with unbounded gradients and relaxed noise assumptions, improving its theoretical understanding and practical application.
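For reference, here is a minimal NumPy sketch of the standard Adam update that this line of analysis studies; the hyperparameters are the common defaults, not values prescribed by the paper.

```python
# Minimal sketch of the vanilla Adam update with a stochastic gradient oracle.
import numpy as np

def adam(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    x = x0.astype(float).copy()
    m = np.zeros_like(x)          # first-moment (mean) estimate
    v = np.zeros_like(x)          # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(x)               # stochastic gradient at the current iterate
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)      # bias correction for the moments
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x
```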
Non-geodesically-convex optimization in the Wasserstein space
·332 words·2 mins·
AI Theory
Optimization
🏢 Department of Computer Science, University of Helsinki
A novel semi Forward-Backward Euler scheme provides convergence guarantees for non-geodesically-convex optimization in Wasserstein space, advancing both sampling and optimization.
Non-asymptotic Global Convergence Analysis of BFGS with the Armijo-Wolfe Line Search
·523 words·3 mins·
Optimization
🏢 University of Texas at Austin
BFGS algorithm achieves global linear and superlinear convergence rates with inexact Armijo-Wolfe line search, even without precise Hessian knowledge.
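For orientation, a minimal NumPy sketch of BFGS paired with an inexact Wolfe line search (via `scipy.optimize.line_search`, which enforces the Armijo and curvature conditions); this illustrates the algorithm being analyzed, not the paper's proof technique.

```python
# Minimal BFGS sketch with an inexact (Armijo-)Wolfe line search, assuming a
# smooth objective f with gradient grad_f.
import numpy as np
from scipy.optimize import line_search

def bfgs(f, grad_f, x0, tol=1e-6, max_iter=200):
    n = x0.size
    H = np.eye(n)                              # inverse-Hessian approximation
    x, g = x0.astype(float).copy(), grad_f(x0)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                             # quasi-Newton search direction
        alpha = line_search(f, grad_f, x, p, gfk=g)[0]   # Wolfe step size
        if alpha is None:                      # line search failed; small fallback step
            alpha = 1e-3
        s = alpha * p
        x_new = x + s
        g_new = grad_f(x_new)
        y = g_new - g
        sy = y @ s
        if sy > 1e-12:                         # curvature condition holds; update H
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x
```

For example, `bfgs(lambda x: x @ x, lambda x: 2 * x, np.ones(5))` drives the iterate to the origin.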
Non-asymptotic Convergence of Training Transformers for Next-token Prediction
·361 words·2 mins·
AI Generated
AI Theory
Optimization
🏢 Penn State University
This paper reveals how a one-layer transformer’s training converges for next-token prediction, showing sub-linear convergence for both layers and shedding light on its surprising generalization abilit…
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
·2753 words·13 mins·
AI Generated
Machine Learning
Optimization
🏢 Sorbonne Université
This paper rigorously analyzes biased adaptive stochastic gradient descent (SGD), proving convergence to critical points for non-convex functions even with biased gradient estimations. The analysis c…
No-Regret M♮-Concave Function Maximization: Stochastic Bandit Algorithms and NP-Hardness of Adversarial Full-Information Setting
·1615 words·8 mins·
AI Generated
AI Theory
Optimization
🏢 Hokkaido University
This paper reveals efficient stochastic bandit algorithms for maximizing M♮-concave functions and proves NP-hardness for adversarial full-information settings.
No-regret Learning in Harmonic Games: Extrapolation in the Face of Conflicting Interests
·354 words·2 mins·
AI Theory
Optimization
🏢 University of Oxford
Extrapolated FTRL ensures Nash equilibrium convergence in harmonic games, defying standard no-regret learning limitations.
No Free Lunch Theorem and Black-Box Complexity Analysis for Adversarial Optimisation
·532 words·3 mins·
AI Generated
AI Theory
Optimization
🏢 University of Birmingham
No free lunch for adversarial optimization: This paper proves that no single algorithm universally outperforms others when finding a Nash equilibrium, introducing black-box complexity analysis to estab…
Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms
·2334 words·11 mins·
Machine Learning
Optimization
🏢 Stanford University
Newton Losses enhance training of neural networks with complex objectives by using second-order information from loss functions, achieving significant performance gains.
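A heavily hedged sketch of the general idea as the summary describes it: use second-order information of the loss to form a better-conditioned training signal. Below this is illustrated, under assumption, as a damped Newton step on the loss in the network's output space followed by an MSE regression onto that target; the helper names, the Tikhonov damping `tau`, and this exact construction are illustrative assumptions, so consult the paper for the precise formulation.

```python
# Hedged sketch: build a surrogate target from a damped Newton step on the loss.
import numpy as np

def newton_loss_target(y, grad_L, hess_L, tau=1.0):
    """Newton-step target y* = y - (H + tau I)^{-1} g for one output vector y."""
    g = grad_L(y)                          # gradient of the original loss at y
    H = hess_L(y)                          # Hessian of the original loss at y
    H_reg = H + tau * np.eye(y.size)       # Tikhonov damping keeps the solve stable
    return y - np.linalg.solve(H_reg, g)

def newton_loss(y, grad_L, hess_L, tau=1.0):
    """MSE surrogate; with y* held fixed, its gradient in y is the damped Newton step."""
    y_star = newton_loss_target(y, grad_L, hess_L, tau)
    return 0.5 * np.sum((y - y_star) ** 2)
```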
Neural Pfaffians: Solving Many Many-Electron Schrödinger Equations
·2649 words·13 mins·
AI Theory
Optimization
🏢 Technical University of Munich
Neural Pfaffians revolutionize many-electron Schrödinger equation solutions by using fully learnable neural wave functions based on Pfaffians, achieving unprecedented accuracy and generalizability acr…
Neural Network Reparametrization for Accelerated Optimization in Molecular Simulations
·2783 words·14 mins·
AI Generated
AI Theory
Optimization
🏢 IBM Research
Accelerate molecular simulations using neural network reparametrization! This flexible method adjusts system complexity, enhances optimization, and maintains continuous access to fine-grained modes, o…
Neural Combinatorial Optimization for Robust Routing Problem with Uncertain Travel Times
·2186 words·11 mins·
AI Theory
Optimization
🏢 Sun Yat-Sen University
Neural networks efficiently solve robust routing problems with uncertain travel times, minimizing worst-case deviations from optimal routes under the min-max regret criterion.
Neural collapse vs. low-rank bias: Is deep neural collapse really optimal?
·2988 words·15 mins·
AI Generated
AI Theory
Optimization
🏢 Institute of Science and Technology Austria
Deep neural collapse, previously believed to be optimal, is shown to be suboptimal in multi-class, multi-layer networks due to a low-rank bias that yields even lower-rank solutions.
Neur2BiLO: Neural Bilevel Optimization
·2909 words·14 mins·
AI Theory
Optimization
🏢 University of Toronto
NEUR2BILO, a neural network-based heuristic, solves mixed-integer bilevel optimization problems extremely fast, achieving high-quality solutions for diverse applications.
Nesterov acceleration despite very noisy gradients
·2415 words·12 mins·
AI Generated
AI Theory
Optimization
🏢 University of Pittsburgh
AGNES, a novel accelerated gradient descent algorithm, retains its accelerated convergence rate even with very noisy gradients, significantly improving training efficiency for machine learning models.
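For context, a minimal sketch of classical Nesterov acceleration driven by a noisy gradient oracle; AGNES modifies this scheme (the paper specifies how), so this is background rather than the proposed algorithm.

```python
# Sketch of classical Nesterov momentum with a (possibly very noisy) gradient oracle.
import numpy as np

def nesterov(grad, x0, lr=1e-2, momentum=0.9, steps=1000):
    x = x0.astype(float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        lookahead = x + momentum * v       # extrapolated point
        g = grad(lookahead)                # noisy gradient estimate at the lookahead
        v = momentum * v - lr * g
        x = x + v
    return x
```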
Nearly Optimal Approximation of Matrix Functions by the Lanczos Method
·1646 words·8 mins·
AI Theory
Optimization
🏢 University of Washington
Lanczos-FA, a simple algorithm for approximating matrix functions, surprisingly outperforms newer methods; this paper proves its near-optimality for rational functions, explaining its practical succes…
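For readers unfamiliar with Lanczos-FA, the sketch below shows the basic recipe it builds on, assuming a symmetric matrix A: run k Lanczos steps to obtain an orthonormal basis Q and tridiagonal T, then approximate f(A)b ≈ ‖b‖ Q f(T) e₁; `lanczos_fa` is an illustrative helper name, not the authors' code.

```python
# Sketch of the Lanczos method for approximating f(A) @ b, A symmetric.
import numpy as np

def lanczos_fa(A, b, f, k):
    n = b.size
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(max(k - 1, 0))
    Q[:, 0] = b / np.linalg.norm(b)
    q_prev = np.zeros(n)
    m = k
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w = w - alpha[j] * Q[:, j] - (beta[j - 1] * q_prev if j > 0 else 0)
        q_prev = Q[:, j]
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            if beta[j] < 1e-12:            # invariant subspace found; stop early
                m = j + 1
                break
            Q[:, j + 1] = w / beta[j]
    Q, alpha, beta = Q[:, :m], alpha[:m], beta[:m - 1]
    # Apply f to the small tridiagonal T = Q^T A Q via its eigendecomposition.
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    evals, evecs = np.linalg.eigh(T)
    fT_e1 = evecs @ (f(evals) * evecs[0])  # f(T) e_1
    return np.linalg.norm(b) * (Q @ fT_e1)
```

For instance, `lanczos_fa(A, b, np.exp, 30)` approximates the action of the matrix exponential on b.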
Nearly Minimax Optimal Submodular Maximization with Bandit Feedback
·384 words·2 mins·
AI Theory
Optimization
🏢 University of Washington
This research establishes the first minimax optimal algorithm for submodular maximization with bandit feedback, achieving a regret bound matching the lower bound.
Nearly Minimax Optimal Regret for Multinomial Logistic Bandit
·1353 words·7 mins·
AI Theory
Optimization
🏢 Seoul National University
This paper presents OFU-MNL+, a constant-time algorithm achieving nearly minimax optimal regret for contextual multinomial logistic bandits, closing the gap between existing upper and lower bounds.