Generalization
On the Impacts of the Random Initialization in the Neural Tangent Kernel Theory
·1555 words·8 mins·
AI Theory
Generalization
🏢 Tsinghua University
Under Neural Tangent Kernel theory, standard initialization negatively impacts neural network generalization, contradicting real-world performance and urging the development of improved theory.
On the Expressivity and Sample Complexity of Node-Individualized Graph Neural Networks
·2191 words·11 mins·
AI Generated
AI Theory
Generalization
🏢 Max Planck Institute of Biochemistry
Boosting GNN expressivity and generalization: Novel node individualization schemes lower sample complexity, improving substructure identification.
On Statistical Rates and Provably Efficient Criteria of Latent Diffusion Transformers (DiTs)
·402 words·2 mins·
AI Theory
Generalization
🏢 Northwestern University
Latent Diffusion Transformers (DiTs) achieve almost-linear time training and inference through low-rank gradient approximations and efficient criteria, overcoming high dimensionality challenges.
On Feature Learning in Structured State Space Models
·1624 words·8 mins·
AI Theory
Generalization
🏢 AGI Foundations
This research identifies novel scaling rules for structured state-space models that improve stability, generalization, and hyperparameter transferability.
No Free Delivery Service: Epistemic limits of passive data collection in complex social systems
·2178 words·11 mins·
AI Theory
Generalization
🏢 Meta AI
Passive data collection in complex social systems invalidates standard AI model validation; new methods are needed.
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
·455 words·3 mins·
AI Generated
AI Theory
Generalization
🏢 Princeton University
SGD can train neural networks to learn low-dimensional polynomials near the information-theoretic limit, surpassing previous correlational statistical query lower bounds.
Model Collapse Demystified: The Case of Regression
·1683 words·8 mins·
AI Theory
Generalization
🏢 Meta
Training AI models on AI-generated data leads to performance degradation, known as model collapse. This paper offers analytical formulas that precisely quantify this effect in high-dimensional regression.
Least Squares Regression Can Exhibit Under-Parameterized Double Descent
·3874 words·19 mins·
AI Generated
AI Theory
Generalization
🏢 Applied Math, Yale University
Under-parameterized linear regression models can surprisingly exhibit double descent, contradicting traditional bias-variance assumptions.
Information-theoretic Generalization Analysis for Expected Calibration Error
·1937 words·10 mins·
AI Theory
Generalization
🏢 Osaka University
New theoretical analysis reveals optimal binning strategies for minimizing bias in expected calibration error (ECE), improving machine learning model calibration evaluation.
Improving Adaptivity via Over-Parameterization in Sequence Models
·2081 words·10 mins·
AI Generated
AI Theory
Generalization
🏢 Tsinghua University
Over-parameterized gradient descent dynamically adapts to signal structure, improving sequence model generalization and outperforming fixed-kernel methods.
Implicit Regularization Paths of Weighted Neural Representations
·1797 words·9 mins·
AI Theory
Generalization
🏢 Carnegie Mellon University
Weighted pretrained features implicitly regularize models, and this paper reveals equivalent paths between weighting schemes and ridge regularization, enabling efficient hyperparameter tuning.
Graph Neural Networks and Arithmetic Circuits
·465 words·3 mins·
AI Generated
AI Theory
Generalization
🏢 Leibniz University Hanover
Graph Neural Networks’ (GNNs) computational power precisely mirrors that of arithmetic circuits, as proven via a novel C-GNN model; this reveals fundamental limits to GNN scalability.
Generalization of Hamiltonian algorithms
·344 words·2 mins·
AI Generated
AI Theory
Generalization
🏢 Istituto Italiano Di Tecnologia
New, tighter generalization bounds are derived for a class of stochastic learning algorithms that generate absolutely continuous probability distributions, enhancing our understanding of their performance.
Generalization Error Bounds for Two-stage Recommender Systems with Tree Structure
·386 words·2 mins·
AI Theory
Generalization
🏢 University of Science and Technology of China
Two-stage recommender systems using tree structures achieve better generalization with more branches and harmonized training data distributions across stages.
Generalization Bounds via Conditional $f$-Information
·358 words·2 mins·
AI Theory
Generalization
🏢 Tongji University
New information-theoretic generalization bounds, based on conditional f-information, improve existing methods by addressing unboundedness and offering a generic approach applicable to various loss functions.
Generalizability of Memorization Neural Network
·1319 words·7 mins·
AI Theory
Generalization
🏢 Chinese Academy of Sciences
Unlocking deep learning’s generalization mystery, this research pioneers a theoretical understanding of memorization neural network generalizability, revealing critical network structural requirements.
Explicit Eigenvalue Regularization Improves Sharpness-Aware Minimization
·2020 words·10 mins·
AI Theory
Generalization
🏢 Monash University
Eigen-SAM significantly boosts generalization in deep learning by directly addressing SAM’s limitations through explicit top Hessian eigenvalue regularization.
Drift-Resilient TabPFN: In-Context Learning Temporal Distribution Shifts on Tabular Data
·5519 words·26 mins·
AI Generated
Machine Learning
Generalization
🏢 Technical University of Munich
Drift-Resilient TabPFN uses in-context learning to handle temporal distribution shifts in tabular data.
Dissecting the Failure of Invariant Learning on Graphs
·4452 words·21 mins·
AI Generated
AI Theory
Generalization
🏢 Peking University
Cross-environment Intra-class Alignment (CIA) and its label-free variant, CIA-LRA, significantly improve node-level OOD generalization on graphs by aligning representations and eliminating spurious features.
Dimension-free deterministic equivalents and scaling laws for random feature regression
·1898 words·9 mins·
AI Theory
Generalization
🏢 École Normale Supérieure
This work delivers dimension-free deterministic equivalents for random feature regression, revealing sharp excess error rates and scaling laws.