Generalization
On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory
·1555 words·8 mins·
AI Theory
Generalization
🏢 Tsinghua University
Under Neural Tangent Kernel theory, standard initialization negatively impacts neural network generalization, contradicting real-world performance and urging the development of improved theory.
On the Expressivity and Sample Complexity of Node-Individualized Graph Neural Networks
·2191 words·11 mins·
AI Generated
AI Theory
Generalization
🏢 Max Planck Institute of Biochemistry
Boosting GNN expressivity and generalization: Novel node individualization schemes lower sample complexity, improving substructure identification.
On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)
·402 words·2 mins·
AI Theory
Generalization
🏢 Northwestern University
Latent Diffusion Transformers (DiTs) achieve almost-linear time training and inference through low-rank gradient approximations and efficient criteria, overcoming high dimensionality challenges.
On Feature Learning in Structured State Space Models
·1624 words·8 mins·
AI Theory
Generalization
🏢 AGI Foundations
This research identifies novel scaling rules for structured state-space models that improve stability, generalization, and hyperparameter transferability.
No Free Delivery Service: Epistemic limits of passive data collection in complex social systems
·2178 words·11 mins·
AI Theory
Generalization
🏢 Meta AI
Passive data collection in complex social systems invalidates standard AI model validation; new methods are needed.
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
·455 words·3 mins·
AI Generated
AI Theory
Generalization
🏢 Princeton University
SGD can train neural networks to learn low-dimensional polynomials near the information-theoretic limit, surpassing previous correlational statistical query lower bounds.
Model Collapse Demystified: The Case of Regression
·1683 words·8 mins·
AI Theory
Generalization
🏢 Meta
Training AI models on AI-generated data leads to performance degradation, known as model collapse. This paper offers analytical formulas that precisely quantify this effect in high-dimensional regression.
Least Squares Regression Can Exhibit Under-Parameterized Double Descent
·3874 words·19 mins·
AI Generated
AI Theory
Generalization
🏢 Applied Math, Yale University
Under-parameterized linear regression models can surprisingly exhibit double descent, contradicting traditional bias-variance assumptions.
Information-theoretic Generalization Analysis for Expected Calibration Error
·1937 words·10 mins·
AI Theory
Generalization
🏢 Osaka University
New theoretical analysis reveals optimal binning strategies for minimizing bias in expected calibration error (ECE), improving machine learning model calibration evaluation.
Improving Adaptivity via Over-Parameterization in Sequence Models
·2081 words·10 mins·
AI Generated
AI Theory
Generalization
🏢 Tsinghua University
Over-parameterized gradient descent dynamically adapts to signal structure, improving sequence model generalization and outperforming fixed-kernel methods.
Implicit Regularization Paths of Weighted Neural Representations
·1797 words·9 mins·
AI Theory
Generalization
🏢 Carnegie Mellon University
Weighted pretrained features implicitly regularize models, and this paper reveals equivalent paths between weighting schemes and ridge regularization, enabling efficient hyperparameter tuning.
Graph Neural Networks and Arithmetic Circuits
·465 words·3 mins·
AI Generated
AI Theory
Generalization
🏢 Leibniz University Hanover
Graph Neural Networks’ (GNNs) computational power precisely mirrors that of arithmetic circuits, as proven via a novel C-GNN model; this reveals fundamental limits to GNN scalability.
Generalization of Hamiltonian algorithms
·344 words·2 mins·
AI Generated
AI Theory
Generalization
🏢 Istituto Italiano Di Tecnologia
New, tighter generalization bounds are derived for a class of stochastic learning algorithms that generate absolutely continuous probability distributions, enhancing our understanding of their performance.
Generalization Error Bounds for Two-stage Recommender Systems with Tree Structure
·386 words·2 mins·
AI Theory
Generalization
🏢 University of Science and Technology of China
Two-stage recommender systems using tree structures achieve better generalization with more branches and harmonized training data distributions across stages.
Generalization Bounds via Conditional $f$-Information
·358 words·2 mins·
AI Theory
Generalization
🏢 Tongji University
New information-theoretic generalization bounds, based on conditional f-information, improve existing methods by addressing unboundedness and offering a generic approach applicable to various loss functions.
Generalizability of Memorization Neural Network
·1319 words·7 mins·
AI Theory
Generalization
🏢 Chinese Academy of Sciences
Unlocking deep learning’s generalization mystery, this research pioneers a theoretical understanding of memorization neural network generalizability, revealing critical network structural requirements.
Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization
·2020 words·10 mins·
AI Theory
Generalization
🏢 Monash University
Eigen-SAM significantly boosts generalization in deep learning by directly addressing SAM’s limitations through explicit top Hessian eigenvalue regularization.
Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data
·5519 words·26 mins·
AI Generated
Machine Learning
Generalization
🏢 Technical University of Munich
Drift-Resilient TabPFN uses in-context learning to handle temporal distribution shifts in tabular data.
Dissecting the Failure of Invariant Learning on Graphs
·4452 words·21 mins·
AI Generated
AI Theory
Generalization
🏢 Peking University
Cross-environment Intra-class Alignment (CIA) and its label-free variant, CIA-LRA, significantly improve node-level OOD generalization on graphs by aligning representations and eliminating spurious features.
Dimension-free deterministic equivalents and scaling laws for random feature regression
·1898 words·9 mins·
AI Theory
Generalization
🏢 École Normale Supérieure
This work delivers dimension-free deterministic equivalents for random feature regression, revealing sharp excess error rates and scaling laws.