Generalization

Deep Homomorphism Networks
·1657 words·8 mins
AI Theory Generalization 🏢 Roku, Inc.
Deep Homomorphism Networks (DHNs) boost graph neural network (GNN) expressiveness by efficiently detecting subgraph patterns using a novel graph homomorphism layer.
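The paper's DHN layer is not reproduced here, but as a minimal sketch of the underlying idea: homomorphism counts of small patterns can be read off the adjacency matrix. For instance, the rooted homomorphism counts of the k-cycle C_k at each node are the diagonal entries of A^k (the toy graph and function below are illustrative, not from the paper).

```python
import numpy as np

def rooted_cycle_hom_counts(adj: np.ndarray, k: int) -> np.ndarray:
    """Rooted homomorphism counts of the k-cycle C_k at each node.

    For a simple graph with adjacency matrix A, diag(A^k) counts closed
    walks of length k from each node, which equals the number of
    homomorphisms of C_k into the graph rooted at that node.
    """
    return np.diag(np.linalg.matrix_power(adj, k))

# Toy graph: a triangle (nodes 0, 1, 2) plus a pendant node 3.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]])

# Triangle nodes get nonzero C_3 counts; the pendant node gets 0.
print(rooted_cycle_hom_counts(A, 3))  # -> [2 2 2 0]
```

A DHN-style layer would, roughly speaking, stack such pattern counts over a chosen family of small graphs as node features inside a message-passing network.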
Credal Learning Theory
·2051 words·10 mins
AI Generated AI Theory Generalization 🏢 University of Manchester
Credal Learning Theory uses convex sets of probabilities to model data distribution variability, providing theoretical risk bounds for machine learning models in dynamic environments.
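A hedged sketch of the setup (notation mine, not necessarily the paper's): rather than assuming a single data distribution P, one posits a convex credal set 𝒞 of plausible distributions and controls the worst-case risk over it:

```latex
\overline{R}(h) \;=\; \sup_{P \in \mathcal{C}} \; \mathbb{E}_{(x,y)\sim P}\big[\ell(h(x), y)\big]
```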
Controlling Multiple Errors Simultaneously with a PAC-Bayes Bound
·547 words·3 mins
AI Generated AI Theory Generalization 🏢 University College London
New PAC-Bayes bound controls multiple error types simultaneously, providing richer generalization guarantees.
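For context, the standard single-error-rate PAC-Bayes (kl) bound that such work builds on states that, with probability at least 1 − δ over an i.i.d. sample of size n, simultaneously for all posteriors Q over hypotheses (P is the prior, R̂ and R the empirical and true risks, kl the binary KL divergence):

```latex
\mathrm{kl}\!\big(\hat{R}(Q) \,\|\, R(Q)\big) \;\le\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{2\sqrt{n}}{\delta}}{n}
```

The summarized paper generalizes this scalar guarantee to control a vector of error types at once.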
Continual learning with the neural tangent ensemble
·1983 words·10 mins
AI Theory Generalization 🏢 Cold Spring Harbor Laboratory
Neural networks, viewed as Bayesian ensembles of fixed classifiers, enable continual learning without forgetting; posterior updates mirror stochastic gradient descent, offering insights into optimization.
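A hedged sketch of the ensemble view (first-order/NTK-style linearization, notation mine): near initialization θ₀, the network behaves as an additive ensemble whose fixed "experts" are the per-parameter gradient functions and whose votes are the weight displacements:

```latex
f_{\theta}(x) \;\approx\; f_{\theta_0}(x) \;+\; \sum_{i} (\theta_i - \theta_{0,i}) \, \frac{\partial f_{\theta_0}}{\partial \theta_i}(x)
```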
Compositional PAC-Bayes: Generalization of GNNs with persistence and beyond
·2208 words·11 mins
AI Theory Generalization 🏢 ETH Zurich
Novel compositional PAC-Bayes framework delivers data-dependent generalization bounds for persistence-enhanced Graph Neural Networks, improving model design and performance.
Can neural operators always be continuously discretized?
·380 words·2 mins
AI Generated AI Theory Generalization 🏢 Shimane University
Neural operators’ continuous discretization is proven impossible in general Hilbert spaces, but achievable using strongly monotone operators, opening new avenues for numerical methods in scientific machine learning.
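For reference, an operator F on a Hilbert space H is strongly monotone with constant c > 0 when

```latex
\langle F(u) - F(v),\, u - v \rangle \;\ge\; c\,\|u - v\|^2 \qquad \text{for all } u, v \in H
```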
Bridging Multicalibration and Out-of-distribution Generalization Beyond Covariate Shift
·1648 words·8 mins
AI Theory Generalization 🏢 Tsinghua University
New model-agnostic framework for out-of-distribution generalization uses multicalibration across overlapping groups, showing improved robustness and prediction under various distribution shifts.
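A hedged sketch of one common formulation of multicalibration (notation mine): a predictor f is α-multicalibrated with respect to a collection 𝒢 of possibly overlapping groups if, for every group g ∈ 𝒢 and prediction level v,

```latex
\Big|\, \mathbb{E}\big[\, y - f(x) \;\big|\; f(x) = v,\; x \in g \,\big] \Big| \;\le\; \alpha
```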
Benign overfitting in leaky ReLU networks with moderate input dimension
·366 words·2 mins
AI Theory Generalization 🏢 University of California, Los Angeles
Leaky ReLU networks exhibit benign overfitting under surprisingly relaxed conditions: the input dimension need only scale linearly with the sample size, challenging prior assumptions in the field.
Back to the Continuous Attractor
·5636 words·27 mins
AI Generated AI Theory Generalization 🏢 Champalimaud Centre for the Unknown
Despite their brittleness, continuous attractors remain functionally robust analog memory models due to persistent slow manifolds surviving bifurcations, enabling accurate approximation and generalization.
Almost Surely Asymptotically Constant Graph Neural Networks
·1976 words·10 mins
AI Theory Generalization 🏢 University of Oxford
Many graph neural networks (GNNs) surprisingly converge to constant outputs with increasing graph size, limiting their expressiveness.
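This is not the paper's construction, but a toy simulation consistent with the claim is easy to write: a mean-aggregation GNN with random features on Erdős–Rényi graphs produces readouts whose spread shrinks toward a constant as graph size grows (all names and parameters below are illustrative).

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_gnn_readout(n: int, p: float = 0.1) -> float:
    """One mean-aggregation layer plus mean readout on a G(n, p) graph."""
    A = np.triu(rng.random((n, n)) < p, k=1)
    A = (A | A.T).astype(float)                  # symmetric adjacency, no self-loops
    x = rng.random((n, 1))                       # random node features
    deg = A.sum(axis=1, keepdims=True) + 1e-9    # avoid division by zero
    h = np.tanh(A @ x / deg)                     # mean aggregation + nonlinearity
    return float(h.mean())                       # mean readout

for n in [50, 200, 800, 3200]:
    outs = [mean_gnn_readout(n) for _ in range(20)]
    print(n, round(float(np.std(outs)), 5))      # spread shrinks as n grows
```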
A generalized neural tangent kernel for surrogate gradient learning
·1667 words·8 mins
AI Theory Generalization 🏢 University of Bern
Researchers introduce a generalized neural tangent kernel for analyzing surrogate gradient learning in neural networks with non-differentiable activation functions, providing a strong theoretical foundation.
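For background, surrogate gradient learning keeps a hard threshold in the forward pass but replaces its ill-defined derivative with a smooth surrogate in the backward pass; one common choice (the "fast sigmoid" surrogate, shown here as an illustration, not necessarily the paper's) is:

```latex
s = \Theta(v - \vartheta) \quad \text{(forward)}, \qquad
\frac{\partial s}{\partial v} \;:=\; \frac{1}{\big(1 + \beta\,|v - \vartheta|\big)^{2}} \quad \text{(backward)}
```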
A Comprehensive Analysis on the Learning Curve in Kernel Ridge Regression
·2395 words·12 mins
AI Theory Generalization 🏢 University of Basel
This study provides a unified theory for kernel ridge regression’s learning curve, improving existing bounds and validating the Gaussian Equivalence Property under minimal assumptions.
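An empirical learning curve for kernel ridge regression is simple to sketch with scikit-learn's KernelRidge; the toy 1-D target and hyperparameters below are my assumptions, not the paper's setting.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.default_rng(0)

def test_mse(n_train: int, n_test: int = 2000, noise: float = 0.1) -> float:
    """Test error of RBF kernel ridge regression on a toy 1-D target."""
    X = rng.uniform(-1, 1, size=(n_train, 1))
    y = np.sin(2 * np.pi * X[:, 0]) + noise * rng.standard_normal(n_train)
    model = KernelRidge(alpha=1e-3, kernel="rbf", gamma=10.0)
    model.fit(X, y)
    Xt = rng.uniform(-1, 1, size=(n_test, 1))
    yt = np.sin(2 * np.pi * Xt[:, 0])
    return float(np.mean((model.predict(Xt) - yt) ** 2))

# Empirical learning curve: test error decays as the sample size grows.
for n in [20, 80, 320, 1280]:
    print(n, test_mse(n))
```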