
Deep Learning

Sparse maximal update parameterization: A holistic approach to sparse training dynamics
·3095 words·15 mins
AI Generated Machine Learning Deep Learning 🏢 Cerebras Systems
SµPar stabilizes sparse neural network training, slashing tuning costs and boosting performance, especially at high sparsity levels, via a novel parameterization technique.
Sparse Bayesian Generative Modeling for Compressive Sensing
·2495 words·12 mins
AI Generated Machine Learning Deep Learning 🏢 TUM School of Computation, Information and Technology
A new learnable prior for compressive sensing solves the inverse problem using only a few corrupted data samples, enabling sparse signal recovery without ground-truth information and uncertainty quant…
Sourcerer: Sample-based Maximum Entropy Source Distribution Estimation
·4767 words·23 mins
AI Generated Machine Learning Deep Learning 🏢 University of Tübingen
Sourcerer: A novel sample-based method for maximum entropy source distribution estimation, resolving ill-posedness while maintaining simulation accuracy.
Soft ascent-descent as a stable and flexible alternative to flooding
·2106 words·10 mins
Machine Learning Deep Learning 🏢 Osaka University
Soft ascent-descent (SoftAD) improves test accuracy and generalization by softening the flooding method, offering competitive accuracy with reduced loss and model complexity.
Sketched Lanczos uncertainty score: a low-memory summary of the Fisher information
·2226 words·11 mins
Machine Learning Deep Learning 🏢 Technical University of Denmark
SLU, a novel low-memory uncertainty score for neural networks, achieves logarithmic memory scaling with model parameters, providing well-calibrated uncertainties and outperforming existing methods.
Simulation-Free Training of Neural ODEs on Paired Data
·3545 words·17 mins
AI Generated Machine Learning Deep Learning 🏢 KAIST
Train Neural ODEs without simulations, achieving high performance on regression and classification by using flow matching in the embedding space of data pairs.
Sigmoid Gating is More Sample Efficient than Softmax Gating in Mixture of Experts
·1350 words·7 mins
Machine Learning Deep Learning 🏢 University of Texas at Austin
Sigmoid gating significantly boosts sample efficiency in Mixture of Experts models compared to softmax gating, offering faster convergence rates for various expert functions.
Sharpness-diversity tradeoff: improving flat ensembles with SharpBalance
·2661 words·13 mins
Machine Learning Deep Learning 🏢 UC San Diego
SharpBalance, a novel training approach, effectively improves deep ensemble performance by addressing the sharpness-diversity trade-off, leading to significant improvements in both in-distribution and…
Set-based Neural Network Encoding Without Weight Tying
·5047 words·24 mins
AI Generated Machine Learning Deep Learning 🏢 University of Oxford
Set-based Neural Network Encoder (SNE) efficiently encodes neural network weights for property prediction, eliminating the need for architecture-specific models and improving generalization across dat…
SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
·1692 words·8 mins
Machine Learning Deep Learning 🏢 Google Research
SequentialAttention++ unites differentiable pruning with combinatorial optimization for efficient and accurate neural network block sparsification, achieving state-of-the-art results.
Sequential Harmful Shift Detection Without Labels
·2657 words·13 mins
Machine Learning Deep Learning 🏢 J.P. Morgan AI Research
This paper introduces a novel, label-free method for detecting harmful distribution shifts in machine learning models deployed in production environments, leveraging a proxy error derived from an erro…
Self-Refining Diffusion Samplers: Enabling Parallelization via Parareal Iterations
·2449 words·12 mins
Machine Learning Deep Learning 🏢 Stanford University
Self-Refining Diffusion Samplers (SRDS) dramatically speeds up diffusion model sampling by leveraging Parareal iterations for parallel-in-time computation, maintaining high-quality outputs.
Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices
·2763 words·13 mins
AI Generated Machine Learning Deep Learning 🏢 New York University
Revolutionizing large neural networks, this paper introduces a continuous parameterization of structured matrices, discovering that full-rank structures without parameter sharing achieve optimal scali…
Score-Optimal Diffusion Schedules
·2200 words·11 mins
Machine Learning Deep Learning 🏢 University of Oxford
Researchers developed a novel algorithm to automatically find optimal schedules for denoising diffusion models (DDMs), significantly improving sample quality and efficiency without manual parameter tu…
Score-based 3D molecule generation with neural fields
·4106 words·20 mins
AI Generated Machine Learning Deep Learning 🏢 Prescient Design
FuncMol: A new neural field model generates 3D molecules efficiently, outperforming existing methods by achieving an order of magnitude faster sampling speed.
Scanning Trojaned Models Using Out-of-Distribution Samples
·2406 words·12 mins
Machine Learning Deep Learning 🏢 Sharif University of Technology
TRODO, a novel trojan detection method using out-of-distribution samples, effectively identifies trojaned classifiers even against adversarial attacks and with limited data.
Scaling transformer neural networks for skillful and reliable medium-range weather forecasting
·2398 words·12 mins
Machine Learning Deep Learning 🏢 UC Los Angeles
Stormer, a simple transformer model, achieves state-of-the-art medium-range weather forecasting accuracy by using weather-specific embedding, randomized dynamics forecasting, and a pressure-weighted l…
Scaling laws for learning with real and surrogate data
·4942 words·24 mins
AI Generated Machine Learning Deep Learning 🏢 Granica Computing Inc.
Boost machine learning with surrogate data! A novel weighted ERM method effectively integrates surrogate data, significantly reducing test errors even with unrelated data, guided by a predictable sca…
Scaling Law for Time Series Forecasting
·2211 words·11 mins
Machine Learning Deep Learning 🏢 Tsinghua University
Unlocking the potential of deep learning for time series forecasting: this study reveals a scaling law influenced by dataset size, model complexity, and the crucial look-back horizon, leading to impro…
Scale-invariant Optimal Sampling for Rare-events Data and Sparse Models
·2539 words·12 mins
AI Generated Machine Learning Deep Learning 🏢 University of Connecticut
Scale-invariant optimal subsampling tackles computational challenges in analyzing massive rare-events data with sparse models, enhancing parameter estimation and variable selection without being affec…