🏢 University of Tokyo
Wide Two-Layer Networks can Learn from Adversarial Perturbations
·2045 words·10 mins·
AI Theory
Robustness
🏢 University of Tokyo
Wide two-layer neural networks can generalize well from mislabeled adversarial examples because adversarial perturbations surprisingly contain sufficient class-specific features.
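To make the claim concrete, here is a minimal, hypothetical sketch of how such mislabeled adversarial examples are typically constructed (an FGSM-style targeted perturbation, assuming a PyTorch classifier; the paper's exact attack and architecture may differ):

```python
import torch
import torch.nn.functional as F

def targeted_perturbation(model, x, y_target, eps=0.03):
    """FGSM-style targeted perturbation (illustrative sketch only;
    not necessarily the attack used in the paper)."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y_target)
    loss.backward()
    # Step against the gradient so x moves toward the target class;
    # the resulting delta carries features of y_target.
    return -eps * x.grad.sign()

# Training a fresh network on (x + delta, y_target) pairs -- inputs that
# look mislabeled to a human -- can still yield nontrivial clean accuracy,
# which is the phenomenon analyzed here for wide two-layer networks.
```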
Unveil Benign Overfitting for Transformer in Vision: Training Dynamics, Convergence, and Generalization
·1822 words·9 mins·
AI Generated
Computer Vision
Vision Transformers
🏢 University of Tokyo
Vision Transformers (ViTs) generalize surprisingly well, even when overfitting training data; this work provides the first theoretical explanation by characterizing the optimization dynamics of ViTs a…
Understanding the Expressivity and Trainability of Fourier Neural Operator: A Mean-Field Perspective
·2537 words·12 mins·
Machine Learning
Deep Learning
🏢 University of Tokyo
A mean-field theory explains Fourier Neural Operator (FNO) behavior, linking expressivity to trainability by identifying ordered and chaotic phases that correspond to vanishing or exploding gradients,…
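For reference, a single FNO layer applies a pointwise linear map plus a spectral convolution, with Fourier transform $\mathcal{F}$, learned spectral weights $R_\phi$, and nonlinearity $\sigma$; the mean-field analysis concerns how signals and gradients propagate through stacks of such layers:

$$
v_{l+1}(x) = \sigma\Big( W v_l(x) + \mathcal{F}^{-1}\big( R_\phi \cdot \mathcal{F} v_l \big)(x) \Big).
$$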
Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
·3071 words·15 mins·
Natural Language Processing
Large Language Models
🏢 University of Tokyo
Linear probing then fine-tuning (LP-FT) significantly improves language model fine-tuning; this paper uses Neural Tangent Kernel (NTK) theory to explain why.
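For context, the NTK of a network $f(\cdot;\theta)$ governs its training dynamics near initialization:

$$
\Theta(x, x') = \nabla_\theta f(x;\theta)^\top \nabla_\theta f(x';\theta).
$$

LP-FT first fits only the linear head on frozen features, then fine-tunes all parameters from that initialization; roughly, starting from a well-fit head keeps fine-tuning from distorting the pretrained features, and the NTK view makes this precise for language models.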
Transformers are Minimax Optimal Nonparametric In-Context Learners
·1461 words·7 mins·
AI Generated
Machine Learning
Meta Learning
🏢 University of Tokyo
Transformers excel at in-context learning by leveraging minimax-optimal nonparametric learning, achieving near-optimal risk with sufficient pretraining data diversity.
Taming the Long Tail in Human Mobility Prediction
·2047 words·10 mins·
AI Applications
Smart Cities
🏢 University of Tokyo
The LoTNext framework tackles the long-tail problem in human mobility prediction, using graph and loss adjustments to improve accuracy on rarely visited locations.
Risk-sensitive control as inference with Rényi divergence
·1494 words·8 mins·
Machine Learning
Reinforcement Learning
🏢 University of Tokyo
Risk-sensitive control is recast as inference using Rényi divergence, yielding new algorithms and revealing equivalences between seemingly disparate methods.
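The key object is the Rényi divergence of order $\alpha$, which recovers the KL divergence used in standard control-as-inference as $\alpha \to 1$:

$$
D_\alpha(P \,\|\, Q) = \frac{1}{\alpha - 1} \log \int p(x)^\alpha \, q(x)^{1-\alpha} \, dx.
$$

Roughly, $\alpha$ plays the role of the risk-sensitivity parameter, interpolating between risk-seeking and risk-averse behavior around the risk-neutral KL case.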
On the Minimax Regret for Contextual Linear Bandits and Multi-Armed Bandits with Expert Advice
·360 words·2 mins·
Machine Learning
Reinforcement Learning
🏢 University of Tokyo
This paper provides novel algorithms and matching lower bounds for multi-armed bandits with expert advice and contextual linear bandits, resolving open questions and advancing theoretical understanding.
Large Language Models as Urban Residents: An LLM Agent Framework for Personal Mobility Generation
·2032 words·10 mins·
AI Applications
Smart Cities
🏢 University of Tokyo
LLM agents effectively generate realistic personal mobility patterns using semantically rich data.
Integrating GNN and Neural ODEs for Estimating Non-Reciprocal Two-Body Interactions in Mixed-Species Collective Motion
·1573 words·8 mins·
Machine Learning
Deep Learning
🏢 University of Tokyo
Deep learning framework integrating GNNs and neural ODEs precisely estimates non-reciprocal two-body interactions in mixed-species collective motion, accurately replicating both individual and collective motion.
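In rough notation (mine, not necessarily the paper's), the modeled dynamics for agent $i$ of species $s_i$ take the form

$$
\dot{x}_i = f_{s_i}(x_i) + \sum_{j \ne i} g_{s_i s_j}(x_i, x_j), \qquad g_{ab} \ne g_{ba} \ \text{in general},
$$

where the pairwise kernels $g_{ab}$ are parameterized by a GNN and fit end-to-end by integrating trajectories with a neural ODE; non-reciprocity means the force species $a$ exerts on $b$ need not mirror the reverse.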
Geometric-Averaged Preference Optimization for Soft Preference Labels
·2987 words·15 mins·
Natural Language Processing
Large Language Models
🏢 University of Tokyo
To improve LLM alignment, this paper introduces soft preference labels and geometric averaging into Direct Preference Optimization, consistently improving performance on standard benchmarks.
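As a hedged sketch of how this plays out (my reading; consult the paper for the exact objective): DPO scores a pair via its implicit reward margin, and replacing the hard winner/loser likelihoods with a $\hat{p}$-weighted geometric mean rescales that margin by $2\hat{p}-1$:

$$
\mathcal{L}(\theta) = -\log \sigma\!\Big( \beta\,(2\hat{p} - 1)\big( r_\theta(x, y_1) - r_\theta(x, y_2) \big) \Big), \qquad r_\theta(x, y) = \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)},
$$

so near-tied pairs ($\hat{p} \approx 0.5$) contribute almost no gradient instead of a full hard-label push.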
Generalization Bound and Learning Methods for Data-Driven Projections in Linear Programming
·1748 words·9 mins·
AI Generated
AI Theory
Optimization
🏢 University of Tokyo
Learn to project, solve faster! This paper introduces data-driven projections for solving high-dimensional linear programs, proving theoretical guarantees and demonstrating significant improvements in…
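The idea in symbols (the projection matrix $P$ is my notation): learn $P \in \mathbb{R}^{n \times k}$ with $k \ll n$ from past instances, solve the reduced LP, and map back:

$$
\max_{x \in \mathbb{R}^n} c^\top x \ \text{ s.t. } Ax \le b
\quad\longrightarrow\quad
\max_{y \in \mathbb{R}^k} c^\top P y \ \text{ s.t. } APy \le b, \qquad \hat{x} = P y^\star .
$$

Solving in $k$ variables instead of $n$ is where the speedup comes from; the generalization bound controls how much objective value the learned projection sacrifices on future instances.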
Generalizable and Animatable Gaussian Head Avatar
·3445 words·17 mins·
AI Generated
Computer Vision
Image Generation
🏢 University of Tokyo
One-shot animatable head avatar reconstruction is achieved using a novel dual-lifting method that generates 3D Gaussians from a single image, enabling real-time expression control and rendering with s…
Fast Rates in Stochastic Online Convex Optimization by Exploiting the Curvature of Feasible Sets
·1343 words·7 mins·
loading
·
loading
AI Theory
Optimization
🏢 University of Tokyo
This paper introduces a novel approach for fast rates in online convex optimization by exploiting the curvature of feasible sets, achieving logarithmic regret bounds under specific conditions.
Enriching Disentanglement: From Logical Definitions to Quantitative Metrics
·3435 words·17 mins·
AI Theory
Representation Learning
🏢 University of Tokyo
This paper presents a novel approach to deriving theoretically grounded disentanglement metrics by linking logical definitions to quantitative measures, offering strong theoretical guarantees and easi…
Dealing with Synthetic Data Contamination in Online Continual Learning
·2977 words·14 mins·
Computer Vision
Image Generation
🏢 University of Tokyo
AI-generated images contaminate online continual learning datasets, hindering performance. A new method, ESRM, leverages entropy and real/synthetic similarity maximization to select high-quality data.
Continuous Temporal Domain Generalization
·2639 words·13 mins·
AI Generated
Machine Learning
Domain Generalization
🏢 University of Tokyo
Koodos: a novel Koopman operator-driven framework that tackles Continuous Temporal Domain Generalization (CTDG) by modeling continuous data dynamics and learning model evolution across irregular time …
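For context, the Koopman operator $\mathcal{K}$ turns a nonlinear dynamical system $\theta_{t+1} = F(\theta_t)$ into a linear one acting on observables $g$:

$$
(\mathcal{K} g)(\theta) = g(F(\theta)),
$$

which, as I read the summary, is what lets Koodos model how a predictor should drift continuously over time in a learned linear latent space.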
ADOPT: Modified Adam Can Converge with Any $\beta_2$ with the Optimal Rate
·1889 words·9 mins·
Machine Learning
Deep Learning
🏢 University of Tokyo
Unlike Adam, ADOPT, a novel adaptive gradient method, achieves the optimal convergence rate without restrictive assumptions, significantly improving deep learning optimization.
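A minimal sketch of the modified update, assuming the two changes the paper describes (normalize the gradient by the previous second-moment estimate, then apply momentum); hyperparameters are illustrative and the released optimizer may differ in details:

```python
import torch

def adopt_step(param, grad, m, v, step, lr=1e-3,
               beta1=0.9, beta2=0.9999, eps=1e-6):
    """One ADOPT-style step (hedged sketch, not the official code)."""
    if step == 0:
        v.copy_(grad * grad)                  # v_0 = g_0^2; no update yet
        return
    # Normalize by the *previous* v, decorrelating g_t from its own scale.
    normed = grad / torch.clamp(v.sqrt(), min=eps)
    # Momentum is applied *after* normalization, unlike Adam.
    m.mul_(beta1).add_(normed, alpha=1 - beta1)
    param.add_(m, alpha=-lr)
    v.mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
```

The point of the reordering is that the second-moment estimate no longer depends on the current gradient, which is what removes the $\beta_2$-dependent non-convergence issue of Adam.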
A Simple and Adaptive Learning Rate for FTRL in Online Learning with Minimax Regret of Θ(T^{2/3}) and its Application to Best-of-Both-Worlds
·334 words·2 mins·
AI Theory
Optimization
🏢 University of Tokyo
A new adaptive learning rate for FTRL achieves the minimax regret of Θ(T^{2/3}) in online learning, improving existing best-of-both-worlds algorithms for various hard problems.
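For context, FTRL plays the point minimizing the cumulative (linearized) loss plus a regularizer $\psi$ scaled by a learning rate $\eta_t$; the contribution here is a simple adaptive choice of $\eta_t$ (the exact schedule is in the paper):

$$
x_{t+1} = \operatorname*{arg\,min}_{x \in \mathcal{X}} \left( \sum_{s=1}^{t} \langle g_s, x \rangle + \frac{\psi(x)}{\eta_t} \right).
$$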
A provable control of sensitivity of neural networks through a direct parameterization of the overall bi-Lipschitzness
·4589 words·22 mins·
AI Theory
Optimization
🏢 University of Tokyo
New framework directly controls neural network sensitivity by precisely parameterizing overall bi-Lipschitzness, offering improved robustness and generalization.
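Bi-Lipschitzness bounds a network's sensitivity from both sides: the upper constant caps how much outputs can move (robustness), while the lower constant prevents distinct inputs from collapsing together (invertibility). The framework parameterizes both constants directly:

$$
\ell\,\|x - y\| \;\le\; \|f(x) - f(y)\| \;\le\; L\,\|x - y\| \qquad \text{for all } x, y .
$$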