Optimization
On Sparse Canonical Correlation Analysis
·2005 words·10 mins·
AI Generated
AI Theory
Optimization
🏢 University of Tennessee
This paper presents novel, efficient algorithms and formulations for Sparse Canonical Correlation Analysis (SCCA), a method that improves the interpretability of traditional CCA. SCCA is especially us…
On Mesa-Optimization in Autoregressively Trained Transformers: Emergence and Capability
·2131 words·11 mins·
AI Generated
AI Theory
Optimization
🏢 Gaoling School of Artificial Intelligence, Renmin University of China
Autoregressively trained transformers surprisingly learn algorithms during pretraining, enabling in-context learning; this paper reveals when and why this ‘mesa-optimization’ happens.
On Convergence of Adam for Stochastic Optimization under Relaxed Assumptions
·308 words·2 mins·
AI Theory
Optimization
🏢 Zhejiang University
Adam optimizer achieves near-optimal convergence in non-convex scenarios with unbounded gradients and relaxed noise assumptions, improving its theoretical understanding and practical application.
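For reference, here is a minimal NumPy sketch of the standard Adam update that this line of analysis studies; the hyperparameters are the common defaults, not values prescribed by the paper.

```python
# Minimal sketch of the vanilla Adam update with a stochastic gradient oracle.
import numpy as np

def adam(grad, x0, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    x = x0.astype(float).copy()
    m = np.zeros_like(x)          # first-moment (mean) estimate
    v = np.zeros_like(x)          # second-moment (uncentered variance) estimate
    for t in range(1, steps + 1):
        g = grad(x)               # stochastic gradient at the current iterate
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        m_hat = m / (1 - beta1 ** t)      # bias correction for the moments
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x
```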
Non-geodesically-convex optimization in the Wasserstein space
·332 words·2 mins·
AI Theory
Optimization
🏢 Department of Computer Science, University of Helsinki
A novel semi Forward-Backward Euler scheme provides convergence guarantees for non-geodesically-convex optimization in Wasserstein space, advancing both sampling and optimization.
Non-asymptotic Global Convergence Analysis of BFGS with the Armijo-Wolfe Line Search
·523 words·3 mins·
Optimization
🏢 University of Texas at Austin
BFGS algorithm achieves global linear and superlinear convergence rates with inexact Armijo-Wolfe line search, even without precise Hessian knowledge.
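For orientation, a minimal NumPy sketch of BFGS paired with an inexact Wolfe line search (via `scipy.optimize.line_search`, which enforces the Armijo and curvature conditions); this illustrates the algorithm being analyzed, not the paper's proof technique.

```python
# Minimal BFGS sketch with an inexact (Armijo-)Wolfe line search, assuming a
# smooth objective f with gradient grad_f.
import numpy as np
from scipy.optimize import line_search

def bfgs(f, grad_f, x0, tol=1e-6, max_iter=200):
    n = x0.size
    H = np.eye(n)                              # inverse-Hessian approximation
    x, g = x0.astype(float).copy(), grad_f(x0)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                             # quasi-Newton search direction
        alpha = line_search(f, grad_f, x, p, gfk=g)[0]   # Wolfe step size
        if alpha is None:                      # line search failed; small fallback step
            alpha = 1e-3
        s = alpha * p
        x_new = x + s
        g_new = grad_f(x_new)
        y = g_new - g
        sy = y @ s
        if sy > 1e-12:                         # curvature condition holds; update H
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x
```

For example, `bfgs(lambda x: x @ x, lambda x: 2 * x, np.ones(5))` drives the iterate to the origin.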
Non-asymptotic Convergence of Training Transformers for Next-token Prediction
·361 words·2 mins·
AI Generated
AI Theory
Optimization
🏢 Penn State University
This paper reveals how a one-layer transformer’s training converges for next-token prediction, showing sub-linear convergence for both layers and shedding light on its surprising generalization abilit…
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
·2753 words·13 mins·
AI Generated
Machine Learning
Optimization
🏢 Sorbonne Université
This paper rigorously analyzes biased adaptive stochastic gradient descent (SGD), proving convergence to critical points for non-convex functions even with biased gradient estimations. The analysis c…
No-Regret M♮-Concave Function Maximization: Stochastic Bandit Algorithms and NP-Hardness of Adversarial Full-Information Setting
·1615 words·8 mins·
AI Generated
AI Theory
Optimization
🏢 Hokkaido University
This paper reveals efficient stochastic bandit algorithms for maximizing M♮-concave functions and proves NP-hardness for adversarial full-information settings.
No-regret Learning in Harmonic Games: Extrapolation in the Face of Conflicting Interests
·354 words·2 mins·
AI Theory
Optimization
🏢 University of Oxford
Extrapolated FTRL ensures Nash equilibrium convergence in harmonic games, defying standard no-regret learning limitations.
No Free Lunch Theorem and Black-Box Complexity Analysis for Adversarial Optimisation
·532 words·3 mins·
AI Generated
AI Theory
Optimization
🏢 University of Birmingham
No free lunch for adversarial optimization: This paper proves that no single algorithm universally outperforms others when finding a Nash equilibrium, introducing black-box complexity analysis to estab…
Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms
·2334 words·11 mins·
Machine Learning
Optimization
🏢 Stanford University
Newton Losses enhance training of neural networks with complex objectives by using second-order information from loss functions, achieving significant performance gains.
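A heavily hedged sketch of the general idea as the summary describes it: use second-order information of the loss to form a better-conditioned training signal. Below this is illustrated, under assumption, as a damped Newton step on the loss in the network's output space followed by an MSE regression onto that target; the helper names, the Tikhonov damping `tau`, and this exact construction are illustrative assumptions, so consult the paper for the precise formulation.

```python
# Hedged sketch: build a surrogate target from a damped Newton step on the loss.
import numpy as np

def newton_loss_target(y, grad_L, hess_L, tau=1.0):
    """Newton-step target y* = y - (H + tau I)^{-1} g for one output vector y."""
    g = grad_L(y)                          # gradient of the original loss at y
    H = hess_L(y)                          # Hessian of the original loss at y
    H_reg = H + tau * np.eye(y.size)       # Tikhonov damping keeps the solve stable
    return y - np.linalg.solve(H_reg, g)

def newton_loss(y, grad_L, hess_L, tau=1.0):
    """MSE surrogate; with y* held fixed, its gradient in y is the damped Newton step."""
    y_star = newton_loss_target(y, grad_L, hess_L, tau)
    return 0.5 * np.sum((y - y_star) ** 2)
```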
Neural Pfaffians: Solving Many Many-Electron Schrödinger Equations
·2649 words·13 mins·
AI Theory
Optimization
🏢 Technical University of Munich
Neural Pfaffians revolutionize many-electron Schrödinger equation solutions by using fully learnable neural wave functions based on Pfaffians, achieving unprecedented accuracy and generalizability acr…
Neural Network Reparametrization for Accelerated Optimization in Molecular Simulations
·2783 words·14 mins·
AI Generated
AI Theory
Optimization
🏢 IBM Research
Accelerate molecular simulations using neural network reparametrization! This flexible method adjusts system complexity, enhances optimization, and maintains continuous access to fine-grained modes, o…
Neural Combinatorial Optimization for Robust Routing Problem with Uncertain Travel Times
·2186 words·11 mins·
AI Theory
Optimization
🏢 Sun Yat-Sen University
Neural networks efficiently solve robust routing problems with uncertain travel times, minimizing worst-case deviations from optimal routes under the min-max regret criterion.
Neural collapse vs. low-rank bias: Is deep neural collapse really optimal?
·2988 words·15 mins·
AI Generated
AI Theory
Optimization
🏢 Institute of Science and Technology Austria
Deep neural collapse, previously believed to be optimal, is shown to be suboptimal in multi-class, multi-layer networks due to a low-rank bias that yields even lower-rank solutions.
Neur2BiLO: Neural Bilevel Optimization
·2909 words·14 mins·
AI Theory
Optimization
🏢 University of Toronto
NEUR2BILO, a neural network-based heuristic, solves mixed-integer bilevel optimization problems extremely fast, achieving high-quality solutions for diverse applications.
Nesterov acceleration despite very noisy gradients
·2415 words·12 mins·
AI Generated
AI Theory
Optimization
🏢 University of Pittsburgh
AGNES, a novel accelerated gradient descent algorithm, retains its accelerated convergence rate even with very noisy gradients, significantly improving training efficiency for machine learning models.
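For context, a minimal sketch of classical Nesterov acceleration driven by a noisy gradient oracle; AGNES modifies this scheme (the paper specifies how), so this is background rather than the proposed algorithm.

```python
# Sketch of classical Nesterov momentum with a (possibly very noisy) gradient oracle.
import numpy as np

def nesterov(grad, x0, lr=1e-2, momentum=0.9, steps=1000):
    x = x0.astype(float).copy()
    v = np.zeros_like(x)
    for _ in range(steps):
        lookahead = x + momentum * v       # extrapolated point
        g = grad(lookahead)                # noisy gradient estimate at the lookahead
        v = momentum * v - lr * g
        x = x + v
    return x
```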
Nearly Optimal Approximation of Matrix Functions by the Lanczos Method
·1646 words·8 mins·
AI Theory
Optimization
🏢 University of Washington
Lanczos-FA, a simple algorithm for approximating matrix functions, surprisingly outperforms newer methods; this paper proves its near-optimality for rational functions, explaining its practical succes…
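For readers unfamiliar with Lanczos-FA, the sketch below shows the basic recipe it builds on, assuming a symmetric matrix A: run k Lanczos steps to obtain an orthonormal basis Q and tridiagonal T, then approximate f(A)b ≈ ‖b‖ Q f(T) e₁; `lanczos_fa` is an illustrative helper name, not the authors' code.

```python
# Sketch of the Lanczos method for approximating f(A) @ b, A symmetric.
import numpy as np

def lanczos_fa(A, b, f, k):
    n = b.size
    Q = np.zeros((n, k))
    alpha, beta = np.zeros(k), np.zeros(max(k - 1, 0))
    Q[:, 0] = b / np.linalg.norm(b)
    q_prev = np.zeros(n)
    m = k
    for j in range(k):
        w = A @ Q[:, j]
        alpha[j] = Q[:, j] @ w
        w = w - alpha[j] * Q[:, j] - (beta[j - 1] * q_prev if j > 0 else 0)
        q_prev = Q[:, j]
        if j < k - 1:
            beta[j] = np.linalg.norm(w)
            if beta[j] < 1e-12:            # invariant subspace found; stop early
                m = j + 1
                break
            Q[:, j + 1] = w / beta[j]
    Q, alpha, beta = Q[:, :m], alpha[:m], beta[:m - 1]
    # Apply f to the small tridiagonal T = Q^T A Q via its eigendecomposition.
    T = np.diag(alpha) + np.diag(beta, 1) + np.diag(beta, -1)
    evals, evecs = np.linalg.eigh(T)
    fT_e1 = evecs @ (f(evals) * evecs[0])  # f(T) e_1
    return np.linalg.norm(b) * (Q @ fT_e1)
```

For instance, `lanczos_fa(A, b, np.exp, 30)` approximates the action of the matrix exponential on b.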
Nearly Minimax Optimal Submodular Maximization with Bandit Feedback
·384 words·2 mins·
AI Theory
Optimization
🏢 University of Washington
This research establishes the first minimax optimal algorithm for submodular maximization with bandit feedback, achieving a regret bound matching the lower bound.
Nearly Minimax Optimal Regret for Multinomial Logistic Bandit
·1353 words·7 mins·
AI Theory
Optimization
🏢 Seoul National University
This paper presents OFU-MNL+, a constant-time algorithm achieving nearly minimax optimal regret for contextual multinomial logistic bandits, closing the gap between existing upper and lower bounds.