Deep Learning
Sparse maximal update parameterization: A holistic approach to sparse training dynamics
·3095 words·15 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Cerebras Systems
SμPar stabilizes sparse neural network training, slashing tuning costs and boosting performance, especially at high sparsity levels, via a novel parameterization technique.
Sparse Bayesian Generative Modeling for Compressive Sensing
·2495 words·12 mins·
AI Generated
Machine Learning
Deep Learning
🏢 TUM School of Computation, Information and Technology
A new learnable prior for compressive sensing solves the inverse problem using only a few corrupted data samples, enabling sparse signal recovery without ground-truth information and uncertainty quant…
Sourcerer: Sample-based Maximum Entropy Source Distribution Estimation
·4767 words·23 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Tübingen
Sourcerer: A novel sample-based method for maximum entropy source distribution estimation, resolving ill-posedness while maintaining simulation accuracy.
Soft ascent-descent as a stable and flexible alternative to flooding
·2106 words·10 mins·
Machine Learning
Deep Learning
🏢 Osaka University
Soft ascent-descent (SoftAD) improves test accuracy and generalization by softening the flooding method, offering competitive accuracy with reduced loss and model complexity.
Sketched Lanczos uncertainty score: a low-memory summary of the Fisher information
·2226 words·11 mins·
Machine Learning
Deep Learning
🏢 Technical University of Denmark
SLU, a novel low-memory uncertainty score for neural networks, achieves memory use that scales logarithmically with the number of model parameters, providing well-calibrated uncertainties and outperforming existing methods.
Simulation-Free Training of Neural ODEs on Paired Data
·3545 words·17 mins·
AI Generated
Machine Learning
Deep Learning
🏢 KAIST
Train Neural ODEs without simulations, achieving high performance on regression and classification by using flow matching in the embedding space of data pairs.
Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts
·1350 words·7 mins·
Machine Learning
Deep Learning
🏢 University of Texas at Austin
Sigmoid gating significantly boosts sample efficiency in Mixture of Experts models compared to softmax gating, offering faster convergence rates for various expert functions.
Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
·2661 words·13 mins·
Machine Learning
Deep Learning
🏢 UC San Diego
SharpBalance, a novel training approach, effectively improves deep ensemble performance by addressing the sharpness-diversity trade-off, leading to significant improvements in both in-distribution and…
Set-based Neural Network Encoding Without Weight Tying
·5047 words·24 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Oxford
Set-based Neural Network Encoder (SNE) efficiently encodes neural network weights for property prediction, eliminating the need for architecture-specific models and improving generalization across dat…
SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
·1692 words·8 mins·
Machine Learning
Deep Learning
🏢 Google Research
SequentialAttention++ unites differentiable pruning with combinatorial optimization for efficient and accurate neural network block sparsification, achieving state-of-the-art results.
Sequential Harmful Shift Detection Without Labels
·2657 words·13 mins·
Machine Learning
Deep Learning
🏢 J.P. Morgan AI Research
This paper introduces a novel, label-free method for detecting harmful distribution shifts in machine learning models deployed in production environments, leveraging a proxy error derived from an erro…
Self-Refining Diffusion Samplers: Enabling Parallelization via Parareal Iterations
·2449 words·12 mins·
Machine Learning
Deep Learning
🏢 Stanford University
Self-Refining Diffusion Samplers (SRDS) dramatically speeds up diffusion model sampling by leveraging Parareal iterations for parallel-in-time computation, maintaining high-quality outputs.
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
·2763 words·13 mins·
AI Generated
Machine Learning
Deep Learning
🏢 New York University
Revolutionizing large neural networks, this paper introduces a continuous parameterization of structured matrices, discovering that full-rank structures without parameter sharing achieve optimal scali…
Score-Optimal Diffusion Schedules
·2200 words·11 mins·
Machine Learning
Deep Learning
🏢 University of Oxford
Researchers developed a novel algorithm to automatically find optimal schedules for denoising diffusion models (DDMs), significantly improving sample quality and efficiency without manual parameter tu…
Score-based 3D molecule generation with neural fields
·4106 words·20 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Prescient Design
FuncMol: A new neural field model generates 3D molecules efficiently, outperforming existing methods by achieving an order of magnitude faster sampling speed.
Scanning Trojaned Models Using Out-of-Distribution Samples
·2406 words·12 mins·
Machine Learning
Deep Learning
🏢 Sharif University of Technology
TRODO, a novel trojan detection method that uses out-of-distribution samples, effectively identifies trojaned classifiers even against adversarial attacks and with limited data.
Scaling transformer neural networks for skillful and reliable medium-range weather forecasting
·2398 words·12 mins·
Machine Learning
Deep Learning
🏢 UC Los Angeles
Stormer, a simple transformer model, achieves state-of-the-art medium-range weather forecasting accuracy by using weather-specific embedding, randomized dynamics forecasting, and a pressure-weighted l…
Scaling laws for learning with real and surrogate data
·4942 words·24 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Granica Computing Inc.
Boost machine learning with surrogate data! A novel weighted ERM method effectively integrates surrogate data, significantly reducing test errors even with unrelated data, guided by a predictable sca…
Scaling Law for Time Series Forecasting
·2211 words·11 mins·
Machine Learning
Deep Learning
🏢 Tsinghua University
Unlocking the potential of deep learning for time series forecasting: this study reveals a scaling law influenced by dataset size, model complexity, and the crucial look-back horizon, leading to impro…
Scale-invariant Optimal Sampling for Rare-events Data and Sparse Models
·2539 words·12 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Connecticut
Scale-invariant optimal subsampling tackles computational challenges in analyzing massive rare-events data with sparse models, enhancing parameter estimation and variable selection without being affec…