Posters
2024
The Space Complexity of Approximating Logistic Loss
·359 words·2 mins
AI Theory
Optimization
🏢 LinkedIn Corporation
This paper proves fundamental space complexity lower bounds for approximating logistic loss, revealing that existing coreset constructions are surprisingly optimal.
The Selective $G$-Bispectrum and its Inversion: Applications to $G$-Invariant Networks
·2369 words·12 mins
Machine Learning
Deep Learning
🏢 UCLouvain
This paper introduces a selective G-Bispectrum algorithm, slashing the computational complexity from O(|G|^2) to O(|G|), making G-invariant deep learning faster and more scalable.
The Secretary Problem with Predicted Additive Gap
·1651 words·8 mins
AI Theory
Optimization
🏢 Institute of Computer Science, University of Bonn
Beat the 1/e barrier in the secretary problem using only an additive gap prediction!
The Sample Complexity of Gradient Descent in Stochastic Convex Optimization
·336 words·2 mins
AI Theory
Optimization
🏢 Tel Aviv University
Gradient descent’s sample complexity in non-smooth stochastic convex optimization is Õ(d/m + 1/√m), matching worst-case ERMs and showing no advantage over naive methods.
The Representation Landscape of Few-Shot Learning and Fine-Tuning in Large Language Models
·3617 words·17 mins
Natural Language Processing
Large Language Models
🏢 Area Science Park
LLMs use different internal structures for few-shot learning and fine-tuning, showing a transition in the middle network layers that impacts information encoding and task solving strategies.
The Reliability of OKRidge Method in Solving Sparse Ridge Regression Problems
·2340 words·11 mins
AI Theory
Optimization
🏢 Wuhan University
OKRidge’s reliability for solving sparse ridge regression problems is rigorously proven through theoretical error analysis, enhancing its applicability in machine learning.
The Price of Implicit Bias in Adversarially Robust Generalization
·3000 words·15 mins
AI Generated
AI Theory
Robustness
🏢 New York University
Optimization’s implicit bias in robust machine learning hurts generalization; this work reveals how algorithm/architecture choices impact robustness, suggesting better optimization strategies are needed.
The Prevalence of Neural Collapse in Neural Multivariate Regression
·2414 words·12 mins
Machine Learning
Deep Learning
🏢 New York University Abu Dhabi
Neural networks exhibit ‘Neural Regression Collapse’ (NRC) during training, where feature vectors collapse to subspaces spanned by principal components of features and weights, and the weight-vector Gram matrix converges to a specific structure.
The Power of Hard Attention Transformers on Data Sequences: A formal language theoretic perspective
·284 words·2 mins
AI Generated
AI Theory
Generalization
🏢 RPTU Kaiserslautern-Landau
Hard attention transformers are surprisingly more powerful on numerical data sequences than on strings; this gap is analyzed theoretically via circuit complexity.
The Power of Extrapolation in Federated Learning
·2710 words·13 mins
AI Generated
Machine Learning
Federated Learning
🏢 GenAI Center of Excellence
Federated learning gets a speed boost: New extrapolation strategies significantly improve FedProx’s convergence, offering both theoretical backing and practical enhancements.
The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models
·2297 words·11 mins
Machine Learning
Deep Learning
🏢 Cornell University
Poisson Midpoint Method quadratically accelerates Langevin Monte Carlo for diffusion models, achieving high-quality image generation with significantly fewer computations.
The motion planning neural circuit in goal-directed navigation as Lie group operator search
·1385 words·7 mins
AI Theory
Representation Learning
🏢 UT Southwestern Medical Center
Neural circuits for goal-directed navigation are modeled as Lie group operator searches, implemented by a two-layer feedforward circuit mimicking Drosophila’s navigation system.
The Minimax Rate of HSIC Estimation for Translation-Invariant Kernels
·215 words·2 mins
AI Theory
Optimization
🏢 Karlsruhe Institute of Technology
Researchers found the minimax optimal rate of HSIC estimation for translation-invariant kernels is O(n⁻¹/²), settling a two-decade-old open question and validating many existing HSIC estimators.
The Map Equation Goes Neural: Mapping Network Flows with Graph Neural Networks
·3312 words·16 mins
Machine Learning
Unsupervised Learning
🏢 University of Zurich
Neuromap leverages graph neural networks to optimize the map equation for community detection, achieving competitive performance and automatically determining the optimal number of clusters.
The Many Faces of Optimal Weak-to-Strong Learning
·1344 words·7 mins
Machine Learning
Optimization
🏢 Aarhus University
A new, surprisingly simple boosting algorithm achieves provably optimal sample complexity and outperforms existing algorithms on large datasets.
The Mamba in the Llama: Distilling and Accelerating Hybrid Models
·2037 words·10 mins
Natural Language Processing
Large Language Models
🏢 Cornell University
This research dramatically accelerates and improves hybrid language models by distilling large Transformers into linear RNNs, achieving performance comparable to the original Transformer with significantly faster inference.
The Limits of Transfer Reinforcement Learning with Latent Low-rank Structure
·1762 words·9 mins
Machine Learning
Reinforcement Learning
🏢 Cornell University
This paper presents computationally efficient transfer reinforcement learning algorithms that remove the dependence on state/action space sizes while achieving minimax optimality.
The Limits of Differential Privacy in Online Learning
·440 words·3 mins
AI Theory
Privacy
🏢 Hong Kong University of Science and Technology
This paper reveals fundamental limits of differential privacy in online learning, demonstrating a clear separation between pure, approximate, and non-private settings.
The Ladder in Chaos: Improving Policy Learning by Harnessing the Parameter Evolving Path in A Low-dimensional Space
·2918 words·14 mins
Machine Learning
Reinforcement Learning
🏢 College of Intelligence and Computing, Tianjin University
Deep RL policy learning is improved by identifying and boosting key parameter update directions using a novel temporal SVD analysis, leading to more efficient and effective learning.
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
·1778 words·9 mins
Machine Learning
Deep Learning
🏢 Institute of Science and Technology Austria
I-OBS, a novel family of sparse recovery algorithms leveraging second-order information, achieves faster convergence rates for sparse DNNs, validated by large-scale experiments.