🏢 Princeton University
When Is Inductive Inference Possible?
·1470 words·7 mins·
AI Theory
Optimization
🏢 Princeton University
This paper provides a tight characterization of inductive inference, proving it’s possible if and only if the hypothesis class is a countable union of online learnable classes, resolving a long-standing open question.
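As a rough formalization of the headline result (my notation, not the paper's):

```latex
% Sketch of the characterization, in notation of my own choosing:
% inductive inference over a hypothesis class H is possible exactly
% when H splits into countably many online learnable pieces.
\[
  \mathcal{H} \text{ is inductively inferable}
  \;\iff\;
  \mathcal{H} = \bigcup_{i \in \mathbb{N}} \mathcal{H}_i
  \quad \text{with each } \mathcal{H}_i \text{ online learnable.}
\]
```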
Understanding the Limits of Vision Language Models Through the Lens of the Binding Problem
·2181 words·11 mins·
Multimodal Learning
Vision-Language Models
🏢 Princeton University
Vision-language models struggle with multi-object reasoning due to the binding problem; this paper reveals human-like capacity limits in VLMs and proposes solutions.
Training Dynamics of Transformers to Recognize Word Co-occurrence via Gradient Flow Analysis
·426 words·2 mins·
Natural Language Processing
Large Language Models
🏢 Princeton University
Researchers reveal how transformers learn word co-occurrence using a novel gradient flow analysis, uncovering a two-phase training process that leads to near-minimum loss and improved model performance.
Tight Rates for Bandit Control Beyond Quadratics
·406 words·2 mins·
AI Generated
AI Theory
Optimization
🏢 Princeton University
This paper presents an algorithm achieving Õ(√T) optimal regret for bandit non-stochastic control with strongly convex and smooth cost functions, overcoming prior limitations of suboptimal bounds.
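For reference, the quantity being bounded is the usual control regret; a sketch in my notation (the paper's benchmark policy class and cost model are more specific):

```latex
% Regret against a benchmark policy class Pi under bandit feedback
% (my notation): cumulative cost of the learner's trajectory minus
% that of the best fixed policy in hindsight.
\[
  \mathrm{Regret}_T
  = \sum_{t=1}^{T} c_t(x_t, u_t)
  - \min_{\pi \in \Pi} \sum_{t=1}^{T} c_t\bigl(x_t^{\pi}, u_t^{\pi}\bigr)
  = \tilde{O}\bigl(\sqrt{T}\bigr).
\]
```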
The Road Less Scheduled
·2275 words·11 mins·
Optimization
🏢 Princeton University
Schedule-Free optimization achieves state-of-the-art results without learning rate schedules, simplifying training and improving efficiency.
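The core update is simple enough to sketch; here is a minimal NumPy version of Schedule-Free SGD with illustrative hyperparameters (the paper also covers an AdamW variant and details this omits):

```python
import numpy as np

def schedule_free_sgd(grad, z0, lr=0.1, beta=0.9, steps=1000):
    """Minimal sketch of Schedule-Free SGD (Defazio et al.).

    z: base SGD iterate; x: running average, returned at the end;
    y: interpolation point where gradients are evaluated. No learning
    rate schedule is needed; the averaged iterate x plays that role.
    (Simplified from the paper.)"""
    z, x = z0.copy(), z0.copy()
    for t in range(1, steps + 1):
        y = (1 - beta) * z + beta * x       # gradient evaluation point
        z = z - lr * grad(y)                # plain SGD step on z
        x = (1 - 1 / t) * x + (1 / t) * z   # online (Polyak-style) average
    return x

# Usage: minimize f(w) = ||w||^2 / 2, whose gradient is w.
w = schedule_free_sgd(lambda w: w, z0=np.ones(3))
```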
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering
·10845 words·51 mins·
AI Applications
Security
🏢 Princeton University
SWE-agent achieves state-of-the-art performance on software engineering benchmarks by creating a custom agent-computer interface that enhances LM agents’ ability to use computers.
SureMap: Simultaneous mean estimation for single-task and multi-task disaggregated evaluation
·2443 words·12 mins·
AI Theory
Fairness
🏢 Princeton University
SureMap, a new method, significantly boosts accuracy in single-task and multi-task disaggregated evaluations of AI models with limited data by transforming the problem into Gaussian mean estimation.
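The Gaussian mean estimation at SureMap's core can be illustrated with classical James-Stein shrinkage; this is a stand-in sketch only, since SureMap's actual estimator uses a richer structured prior:

```python
import numpy as np

def james_stein(y, sigma2):
    # Classical shrinkage for a Gaussian mean: pull the raw per-group
    # estimates toward zero, with the amount of shrinkage chosen so the
    # risk beats the naive estimate whenever dim >= 3. A stand-in for
    # the structured estimators SureMap builds on, not the paper's method.
    d = y.size
    shrink = max(0.0, 1.0 - (d - 2) * sigma2 / np.dot(y, y))
    return shrink * y

# Usage: noisy per-subgroup accuracy estimates around true effects.
rng = np.random.default_rng(0)
theta = rng.normal(0.0, 0.5, size=20)       # true subgroup effects
y = theta + rng.normal(0.0, 1.0, size=20)   # noisy observations
print(james_stein(y, sigma2=1.0))
```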
SimPO: Simple Preference Optimization with a Reference-Free Reward
·3091 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Princeton University
SimPO: a simpler, reference-free reward algorithm that significantly outperforms existing offline preference optimization methods, achieving higher accuracy and efficiency in aligning LLMs with human preferences.
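The loss itself is compact; a minimal PyTorch sketch, with `beta` and `gamma` values that are illustrative rather than the paper's tuned settings:

```python
import torch.nn.functional as F

def simpo_loss(logp_chosen, logp_rejected, len_chosen, len_rejected,
               beta=2.0, gamma=0.5):
    """SimPO objective (sketch): the implicit reward is the length-
    normalized sequence log-probability, with no reference model.
    logp_* are tensors of summed token log-probs per response; gamma
    is the target reward margin."""
    r_chosen = beta * logp_chosen / len_chosen        # avg log-prob reward
    r_rejected = beta * logp_rejected / len_rejected
    return -F.logsigmoid(r_chosen - r_rejected - gamma).mean()
```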
Probabilistic Federated Prompt-Tuning with Non-IID and Imbalanced Data
·2066 words·10 mins·
Machine Learning
Federated Learning
🏢 Princeton University
Probabilistic Federated Prompt Tuning (PFPT) significantly improves federated learning accuracy on heterogeneous and imbalanced data by using a probabilistic model for prompt aggregation, outperforming existing baselines.
Optimal Aggregation of Prediction Intervals under Unsupervised Domain Shift
·1968 words·10 mins·
AI Generated
Machine Learning
Transfer Learning
🏢 Princeton University
This paper introduces a novel method for creating highly accurate and narrow prediction intervals even when data distribution shifts unexpectedly, significantly improving machine learning model reliability.
One-Layer Transformer Provably Learns One-Nearest Neighbor In Context
·1344 words·7 mins·
AI Theory
Optimization
🏢 Princeton University
One-layer transformers provably learn the one-nearest neighbor prediction rule, offering theoretical insights into their in-context learning capabilities.
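The intuition is easy to reproduce: softmax attention with a large inverse temperature concentrates on the closest key. A NumPy sketch of this mechanism (illustrative, not the paper's exact one-layer construction):

```python
import numpy as np

def attention_1nn(queries, keys, values, beta=50.0):
    # Softmax attention over negative squared distances: as beta grows,
    # the weights concentrate on each query's nearest key, so the output
    # approaches the one-nearest-neighbor prediction rule.
    d2 = ((queries[:, None, :] - keys[None, :, :]) ** 2).sum(-1)  # (q, n)
    scores = -beta * d2
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    w = np.exp(scores)
    w /= w.sum(axis=1, keepdims=True)             # softmax weights
    return w @ values                             # ~ nearest neighbor's value
```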
Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit
·455 words·3 mins·
AI Generated
AI Theory
Generalization
🏢 Princeton University
SGD can train neural networks to learn low-dimensional polynomials near the information-theoretic limit, surpassing previous correlational statistical query lower bounds.
Low-Rank Optimal Transport through Factor Relaxation with Latent Coupling
·2606 words·13 mins·
Machine Learning
Optimization
🏢 Princeton University
FRLC: a novel algorithm for low-rank optimal transport using latent coupling, enabling faster computation and better interpretability for diverse applications.
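Roughly, the transport plan is kept in factored form rather than materialized; a hedged sketch of the parameterization's shape (my notation; see the paper for the exact marginal constraints on each factor):

```latex
% Shape of the low-rank plan with a latent coupling Lambda between
% source-side and target-side latent clusters (my notation):
\[
  P = Q \,\Lambda\, R^{\top},
  \qquad Q \in \mathbb{R}_{+}^{n \times r},\;
  \Lambda \in \mathbb{R}_{+}^{r \times r},\;
  R \in \mathbb{R}_{+}^{m \times r},
\]
% so P is never materialized: applying it to a vector costs
% O((n+m)r + r^2) rather than O(nm), and the latent coupling is
% interpretable as cluster-to-cluster mass transfer.
```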
Learning Human-like Representations to Enable Learning Human Values
·2442 words·12 mins·
AI Theory
Representation Learning
🏢 Princeton University
Aligning AI’s world representation with humans enables faster, safer learning of human values, improving both exploration and generalization.
Learning and Transferring Sparse Contextual Bigrams with Linear Transformers
·1445 words·7 mins·
Natural Language Processing
Text Generation
🏢 Princeton University
Linear transformers efficiently learn sparse contextual bigrams by leveraging both in-context and global information, achieving polynomial sample complexity.
Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
·2061 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Princeton University
Kraken: A new Transformer architecture boosts multi-device inference speed by 35.6% by cleverly overlapping communication with computation.
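The overlap idea can be sketched with an asynchronous collective; a toy sketch only (not Kraken's exact architecture), assuming an already initialized `torch.distributed` process group:

```python
import torch.distributed as dist

def parallel_block(x, attn, mlp):
    """Sketch of the overlap idea: with parallel sub-layers, the
    all-reduce for the attention output can run while the MLP computes
    on local data. `attn` and `mlp` are ordinary modules."""
    y = attn(x)
    work = dist.all_reduce(y, async_op=True)  # start collective, don't block
    z = mlp(x)                                # local compute overlaps comms
    work.wait()                               # ensure y is fully reduced
    return x + y + z                          # residual combine (illustrative)
```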
Introspective Planning: Aligning Robots' Uncertainty with Inherent Task Ambiguity
·2643 words·13 mins·
AI Applications
Robotics
🏢 Princeton University
Robots using LLMs for task planning often make unsafe or wrong decisions due to LLM hallucination and ambiguity in instructions. This paper introduces ‘introspective planning,’ a novel method that aligns the robot’s uncertainty with the task’s inherent ambiguity.
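A sketch of a calibration step in the same spirit, using split conformal prediction over candidate plans (the paper's exact scores and guarantees differ):

```python
import numpy as np

def conformal_action_set(cal_probs_true, option_probs, alpha=0.1):
    """Split conformal prediction over candidate plans (sketch).
    cal_probs_true: model confidence assigned to the *correct* option
    on held-out calibration tasks; option_probs: confidences for the
    current task's candidate actions."""
    scores = 1.0 - np.asarray(cal_probs_true)          # nonconformity scores
    n = scores.size
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    qhat = np.quantile(scores, level)                  # calibrated threshold
    # Keep every option plausible at level 1 - alpha; a singleton set
    # means "act", a larger set means "ask for clarification".
    return [i for i, p in enumerate(option_probs) if 1.0 - p <= qhat]
```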
Inference via Interpolation: Contrastive Representations Provably Enable Planning and Inference
·1693 words·8 mins·
AI Theory
Representation Learning
🏢 Princeton University
Contrastive learning enables efficient probabilistic inference in high-dimensional time series by creating Gaussian representations that form a Gauss-Markov chain, allowing closed-form solutions to planning and inference queries.
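The gist, in my notation (the paper derives the exact coefficients):

```latex
% If learned representations psi form a Gauss--Markov chain, the
% waypoint given both endpoints is Gaussian in closed form, with a
% mean that linearly interpolates the endpoint representations:
\[
  p\bigl(\psi(s_t) \mid \psi(s_0), \psi(s_T)\bigr)
  = \mathcal{N}\bigl(A_t\,\psi(s_0) + B_t\,\psi(s_T),\; \Sigma_t\bigr).
\]
```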
GREATS: Online Selection of High-Quality Data for LLM Training in Every Iteration
·1719 words·9 mins·
Large Language Models
🏢 Princeton University
GREATS: a novel online batch selection method significantly speeds up LLM training by greedily selecting high-quality data batches in every iteration, improving both convergence and generalization performance.
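A toy version of the selection rule based on first-order influence scoring (GREATS's contribution includes computing such scores efficiently without materializing per-sample gradients, which this sketch ignores):

```python
import numpy as np

def greedy_batch(candidate_grads, val_grad, k):
    # Rank each candidate example by the first-order estimate of how
    # much a step on it reduces validation loss: the inner product of
    # its gradient with the validation gradient. Keep the top-k.
    scores = candidate_grads @ val_grad      # <g_i, g_val> per example
    return np.argsort(scores)[-k:][::-1]     # indices of the best k
```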
Gradient Guidance for Diffusion Models: An Optimization Perspective
·2233 words·11 mins·
AI Theory
Optimization
🏢 Princeton University
This paper provides a novel optimization framework for guided diffusion models, proving Õ(1/K) convergence for concave objective functions and demonstrating structure-preserving guidance.
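A hedged sketch of one guided update, where `score_model` and `objective` are hypothetical callables standing in for the learned score network and the external objective (not the paper's exact sampler):

```python
import torch

def guided_step(x_t, t, score_model, objective, eta=0.1):
    """One guided update (sketch): add the objective's gradient, taken
    through the model's denoised estimate x0_hat, to the learned score
    before the reverse step. score_model is assumed to return both the
    score and the denoised estimate."""
    x_t = x_t.detach().requires_grad_(True)
    score, x0_hat = score_model(x_t, t)      # learned score + denoised guess
    g = torch.autograd.grad(objective(x0_hat).sum(), x_t)[0]
    guided_score = score + eta * g           # steer toward higher objective
    return x_t + guided_score                # placeholder reverse update
```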