AI Theory
Topological obstruction to the training of shallow ReLU neural networks
·1553 words·8 mins·
AI Theory
Optimization
🏢 Politecnico di Torino
Shallow ReLU neural networks face topological training obstructions due to gradient flow confinement on disconnected quadric hypersurfaces.
Topological Generalization Bounds for Discrete-Time Stochastic Optimization Algorithms
·8286 words·39 mins·
AI Generated
AI Theory
Generalization
🏢 University of Edinburgh
New topology-based complexity measures reliably predict deep learning model generalization, outperforming existing methods and offering practical computational efficiency.
Tighter Convergence Bounds for Shuffled SGD via Primal-Dual Perspective
·1717 words·9 mins·
AI Generated
AI Theory
Optimization
🏢 University of Wisconsin-Madison
Shuffled SGD’s convergence is now better understood through a primal-dual analysis, yielding tighter bounds that align with its superior empirical performance.
Tight Rates for Bandit Control Beyond Quadratics
·406 words·2 mins·
AI Generated
AI Theory
Optimization
🏢 Princeton University
This paper presents an algorithm achieving the optimal Õ(√T) regret for bandit non-stochastic control with strongly convex and smooth cost functions, overcoming the suboptimal bounds of prior work.
Tight Bounds for Learning RUMs from Small Slates
·255 words·2 mins·
AI Generated
AI Theory
Optimization
🏢 Google Research
Learning user preferences accurately from limited data is key; this paper shows that surprisingly small slates suffice for precise prediction and provides efficient algorithms to achieve this.
Theoretical guarantees in KL for Diffusion Flow Matching
·242 words·2 mins·
AI Generated
AI Theory
Generalization
🏢 École Polytechnique
Novel theoretical guarantees for Diffusion Flow Matching (DFM) models are established, bounding the KL divergence under mild assumptions on data and base distributions.
Theoretical Foundations of Deep Selective State-Space Models
·379 words·2 mins·
AI Theory
Generalization
🏢 Imperial College London
Deep learning’s sequence modeling is revolutionized by selective state-space models (SSMs)! This paper provides theoretical grounding for their superior performance, revealing the crucial role of gating mechanisms.
Theoretical Characterisation of the Gauss Newton Conditioning in Neural Networks
·2952 words·14 mins·
AI Theory
Optimization
🏢 University of Basel
New theoretical bounds reveal how neural network architecture impacts the Gauss-Newton matrix’s conditioning, paving the way for improved optimization.
Theoretical and Empirical Insights into the Origins of Degree Bias in Graph Neural Networks
·2828 words·14 mins·
AI Theory
Fairness
🏢 University of California, Los Angeles
Researchers unveil the origins of degree bias in Graph Neural Networks (GNNs), proving that high-degree nodes have a lower misclassification probability and proposing methods to alleviate this bias for fairer GNNs.
Theoretical Analysis of Weak-to-Strong Generalization
·1703 words·8 mins·
AI Theory
Generalization
🏢 MIT CSAIL
Strong student models can learn from weaker teachers, even correcting errors and generalizing beyond the teacher’s expertise. This paper provides new theoretical bounds explaining this ‘weak-to-strong’ generalization.
The Surprising Effectiveness of SP Voting with Partial Preferences
·3640 words·18 mins·
AI Theory
Optimization
🏢 Penn State University
Partial preferences and noisy votes hinder accurate ranking recovery; this paper introduces scalable SP voting variants and empirically demonstrates their superior performance in recovering ground-truth rankings.
The Space Complexity of Approximating Logistic Loss
·359 words·2 mins·
AI Theory
Optimization
🏢 LinkedIn Corporation
This paper proves fundamental space complexity lower bounds for approximating logistic loss, revealing that existing coreset constructions are surprisingly optimal.
The Secretary Problem with Predicted Additive Gap
·1651 words·8 mins·
AI Theory
Optimization
🏢 Institute of Computer Science, University of Bonn
Beat the 1/e barrier in the secretary problem using only an additive gap prediction!
The Sample Complexity of Gradient Descent in Stochastic Convex Optimization
·336 words·2 mins·
AI Theory
Optimization
🏢 Tel Aviv University
Gradient descent’s generalization error in non-smooth stochastic convex optimization is Θ̃(d/m + 1/√m), matching worst-case ERMs and showing no advantage over naive methods.
The Reliability of OKRidge Method in Solving Sparse Ridge Regression Problems
·2340 words·11 mins·
AI Theory
Optimization
🏢 Wuhan University
OKRidge’s reliability for solving sparse ridge regression problems is rigorously proven through theoretical error analysis, enhancing its applicability in machine learning.
The Price of Implicit Bias in Adversarially Robust Generalization
·3000 words·15 mins·
AI Generated
AI Theory
Robustness
🏢 New York University
Optimization’s implicit bias in robust machine learning hurts generalization; this work reveals how algorithm and architecture choices impact robustness, suggesting that better optimization strategies are needed.
The Power of Hard Attention Transformers on Data Sequences: A formal language theoretic perspective
·284 words·2 mins·
AI Generated
AI Theory
Generalization
🏢 RPTU Kaiserslautern-Landau
Hard attention transformers are surprisingly more powerful on numerical data sequences than on strings; this gain is analyzed theoretically via circuit complexity.
The motion planning neural circuit in goal-directed navigation as Lie group operator search
·1385 words·7 mins·
AI Theory
Representation Learning
🏢 UT Southwestern Medical Center
Neural circuits for goal-directed navigation are modeled as Lie group operator searches, implemented by a two-layer feedforward circuit mimicking Drosophila’s navigation system.
The Minimax Rate of HSIC Estimation for Translation-Invariant Kernels
·215 words·2 mins·
AI Theory
Optimization
🏢 Karlsruhe Institute of Technology
Researchers found the minimax optimal rate of HSIC estimation for translation-invariant kernels is O(n⁻¹/²), settling a two-decade-old open question and validating many existing HSIC estimators.
The Limits of Differential Privacy in Online Learning
·440 words·3 mins·
AI Theory
Privacy
🏢 Hong Kong University of Science and Technology
This paper reveals fundamental limits of differential privacy in online learning, demonstrating a clear separation between pure, approximate, and non-private settings.