Deep Learning

Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective
·1624 words·8 mins
Machine Learning Deep Learning 🏒 ShanghaiTech University
Neural Collapse analysis shows that Deep Equilibrium Models, unlike explicit models, excel on imbalanced data thanks to their feature-convergence and self-duality properties.
Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure
·3934 words·19 mins
Machine Learning Deep Learning 🏒 University of Michigan
Diffusion models’ surprising generalizability stems from an inductive bias towards learning Gaussian data structures, a finding that reshapes our understanding of their training and generalization.
Unconditional stability of a recurrent neural circuit implementing divisive normalization
·2728 words·13 mins
AI Generated Machine Learning Deep Learning 🏒 Courant Institute of Mathematical Sciences, NYU
The biologically inspired ORGANICs neural circuit achieves dynamic divisive normalization, ensuring unconditional stability and seamless backpropagation training for high-dimensional recurrent networks.
UGC: Universal Graph Coarsening
·2262 words·11 mins
Machine Learning Deep Learning 🏒 Yardi School of Artificial Intelligence
UGC: Blazing-fast graph coarsening for big data, preserving key insights across diverse graph types.
TuneTables: Context Optimization for Scalable Prior-Data Fitted Networks
·4869 words·23 mins
AI Generated Machine Learning Deep Learning 🏒 New York University
TuneTables optimizes PFNs for scalability via context optimization, achieving state-of-the-art performance on large tabular datasets while using fewer parameters and reducing inference time.
Treeffuser: probabilistic prediction via conditional diffusions with gradient-boosted trees
·2082 words·10 mins
Machine Learning Deep Learning 🏒 Department of Computer Science, Columbia University
Treeffuser: Accurate probabilistic predictions from tabular data using conditional diffusion models and gradient-boosted trees!
Transferable Boltzmann Generators
·4942 words·24 mins
AI Generated Machine Learning Deep Learning 🏒 Freie Universität Berlin
Transferable Boltzmann Generators enable efficient, zero-shot sampling of unseen molecular systems’ equilibrium distributions, boosting molecular simulations.
Training Binary Neural Networks via Gaussian Variational Inference and Low-Rank Semidefinite Programming
·1655 words·8 mins
AI Generated Machine Learning Deep Learning 🏒 University of Chicago
VISPA, a novel BNN training framework using Gaussian variational inference and low-rank SDP, achieves state-of-the-art accuracy on various benchmarks.
Towards Exact Gradient-based Training on Analog In-memory Computing
·1654 words·8 mins
Machine Learning Deep Learning 🏒 Rensselaer Polytechnic Institute
Analog in-memory computing (AIMC) training suffers from asymptotic errors due to asymmetric updates. This paper rigorously proves this limitation, proposes a novel discrete-time model to characterize …
Towards Dynamic Message Passing on Graphs
·2834 words·14 mins
Machine Learning Deep Learning 🏒 Institute of Computing Technology, CAS
N2: A novel dynamic message-passing GNN tackles message-passing bottlenecks and high computational costs by introducing learnable pseudo-nodes and dynamic pathways in a common state space, achieving s…
Towards a Scalable Reference-Free Evaluation of Generative Models
·3926 words·19 mins
AI Generated Machine Learning Deep Learning 🏒 Chinese University of Hong Kong
FKEA: a novel, scalable method for reference-free evaluation of generative models’ diversity using random Fourier features, overcoming computational limitations of existing entropy-based scores.
TinyTTA: Efficient Test-time Adaptation via Early-exit Ensembles on Edge Devices
·2263 words·11 mins
Machine Learning Deep Learning 🏒 University of Cambridge
TinyTTA enables efficient test-time adaptation on memory-constrained edge devices using a novel self-ensemble and early-exit strategy, improving accuracy and reducing memory usage.
TimeXer: Empowering Transformers for Time Series Forecasting with Exogenous Variables
·2924 words·14 mins
Machine Learning Deep Learning 🏒 Tsinghua University
TimeXer empowers transformers for superior time series forecasting by cleverly integrating exogenous variables, achieving state-of-the-art results on diverse benchmarks.
Time Makes Space: Emergence of Place Fields in Networks Encoding Temporally Continuous Sensory Experiences
·3838 words·19 mins
AI Generated Machine Learning Deep Learning 🏒 University of Pennsylvania
Networks trained on continuous sensory data spontaneously develop place cell-like responses, demonstrating that time-encoded experience can create spatial maps in the brain.
The Selective $G$-Bispectrum and its Inversion: Applications to $G$-Invariant Networks
·2369 words·12 mins
Machine Learning Deep Learning 🏒 UCLouvain
This paper introduces a selective $G$-Bispectrum algorithm, slashing the computational complexity from $O(|G|^2)$ to $O(|G|)$, making $G$-invariant deep learning faster and more scalable.
The Prevalence of Neural Collapse in Neural Multivariate Regression
·2414 words·12 mins
Machine Learning Deep Learning 🏒 New York University Abu Dhabi
Neural networks exhibit ‘Neural Regression Collapse’ (NRC) during training, where feature vectors collapse to subspaces spanned by principal components of features and weights, and the weight vector G…
The Poisson Midpoint Method for Langevin Dynamics: Provably Efficient Discretization for Diffusion Models
·2297 words·11 mins
Machine Learning Deep Learning 🏒 Cornell University
Poisson Midpoint Method quadratically accelerates Langevin Monte Carlo for diffusion models, achieving high-quality image generation with significantly fewer computations.
The Iterative Optimal Brain Surgeon: Faster Sparse Recovery by Leveraging Second-Order Information
·1778 words·9 mins
Machine Learning Deep Learning 🏒 Institute of Science and Technology Austria
I-OBS, a novel family of sparse recovery algorithms leveraging second-order information, achieves faster convergence rates for sparse DNNs, validated by large-scale experiments.
The Importance of Being Scalable: Improving the Speed and Accuracy of Neural Network Interatomic Potentials Across Chemical Domains
·1885 words·9 mins
Machine Learning Deep Learning 🏒 UC Berkeley
ESCAIP, a novel neural network architecture, dramatically boosts the speed and accuracy of atomic simulations by leveraging attention mechanisms, enabling efficient large-scale modeling across diverse…
The Implicit Bias of Gradient Descent on Separable Multiclass Data
·1300 words·7 mins
Machine Learning Deep Learning 🏒 University of Michigan
Researchers extended implicit bias theory to multiclass classification using a novel framework, proving that gradient descent prefers simple solutions even with complex alternatives.