
Posters

2024

Non-asymptotic Convergence of Training Transformers for Next-token Prediction
·361 words·2 mins
AI Generated AI Theory Optimization 🏢 Penn State University
This paper reveals how a one-layer transformer’s training converges for next-token prediction, showing sub-linear convergence for both layers and shedding light on its surprising generalization abilit…
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
·2753 words·13 mins
AI Generated Machine Learning Optimization 🏢 Sorbonne Université
This paper rigorously analyzes biased adaptive stochastic gradient descent (SGD), proving convergence to critical points for non-convex functions even with biased gradient estimations. The analysis c…
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
·2513 words·12 mins
Natural Language Processing Large Language Models 🏢 Rice University
NoMAD-Attention achieves up to 2x speedup in 4-bit quantized LLaMA inference on CPUs by replacing computationally expensive multiply-add operations with ultra-low-latency in-register lookups.
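A minimal NumPy sketch of the lookup idea the summary describes, assuming a product-quantization-style key codebook; the shapes, the random codebook, and `lookup_scores` are illustrative stand-ins, not NoMAD-Attention's implementation:

```python
import numpy as np

d, n_keys, n_sub, n_cent = 64, 128, 8, 16   # dims, #keys, sub-quantizers, centroids
sub_d = d // n_sub
rng = np.random.default_rng(0)

centroids = rng.normal(size=(n_sub, n_cent, sub_d))  # trained offline in practice
keys = rng.normal(size=(n_keys, d))

# Offline: encode each key sub-vector as the index of its nearest centroid.
codes = np.stack(
    [np.argmin(((keys[:, s * sub_d:(s + 1) * sub_d][:, None]
                 - centroids[s]) ** 2).sum(-1), axis=1)
     for s in range(n_sub)], axis=1)                  # (n_keys, n_sub)

def lookup_scores(q):
    # Per query: one small table of <query sub-vector, centroid> dot products...
    tables = np.einsum('sd,scd->sc', q.reshape(n_sub, sub_d), centroids)
    # ...then scoring every key is pure lookup-and-add, no per-key multiplies
    # (the real method keeps these tables in SIMD registers).
    return tables[np.arange(n_sub), codes].sum(axis=1)  # (n_keys,)

q = rng.normal(size=d)
print(np.corrcoef(lookup_scores(q), keys @ q)[0, 1])  # tracks the exact scores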
Noisy Dual Mirror Descent: A Near Optimal Algorithm for Jointly-DP Convex Resource Allocation
·2148 words·11 mins
AI Generated AI Theory Privacy 🏢 Nanyang Business School, Nanyang Technological University
This paper introduces a near-optimal algorithm for jointly differentially private convex resource allocation, achieving improved accuracy and privacy guarantees.
NoiseGPT: Label Noise Detection and Rectification through Probability Curvature
·2389 words·12 mins
Natural Language Processing Large Language Models 🏢 Beijing Institute of Technology
NoiseGPT uses multi-modal LLMs to detect & fix noisy image labels by identifying probability curvature differences between clean and noisy examples.
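A toy illustration of the idea that probability curvature differs between clean and noisy points; this second-difference probe and the bump-shaped score function are assumptions for illustration, not NoiseGPT's detector, which queries a multimodal LLM:

```python
import numpy as np

def curvature(prob_fn, x, direction, eps=0.05):
    # Central second difference of p(label | x) along a probe direction.
    p_minus, p_center, p_plus = (prob_fn(x - eps * direction),
                                 prob_fn(x),
                                 prob_fn(x + eps * direction))
    return (p_plus - 2 * p_center + p_minus) / eps ** 2

# Toy stand-in for the model's label probability: a smooth bump at a prototype.
proto = np.zeros(8)
prob = lambda x: float(np.exp(-np.sum((x - proto) ** 2)))

direction = np.ones(8) / np.sqrt(8)
clean_example = proto + 0.05   # correctly labeled: near the mode, strong curvature
noisy_example = proto + 2.0    # mislabeled: flat region, near-zero curvature
print("clean:", curvature(prob, clean_example, direction))
print("noisy:", curvature(prob, noisy_example, direction))
```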
Noise-Aware Differentially Private Regression via Meta-Learning
·3336 words·16 mins
AI Generated AI Theory Privacy 🏢 University of Helsinki
Meta-learning and differential privacy combine to enable accurate, well-calibrated private regression, even with limited data, via the novel DPConvCNP model.
Noise Contrastive Alignment of Language Models with Explicit Rewards
·2166 words·11 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
This paper introduces InfoNCA and NCA, novel frameworks for language model alignment using noise contrastive estimation, enabling direct optimization from both explicit rewards and pairwise preference…
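A hedged PyTorch sketch of an InfoNCA-style objective as the summary describes it: soft targets from explicit rewards, contrasted against the model's implicit rewards over K candidate responses. The temperatures `alpha`, `beta` and the log-ratio parameterization are assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def info_nca_loss(policy_logps, ref_logps, rewards, alpha=1.0, beta=0.1):
    """Contrastive alignment over K candidate responses per prompt.
    policy_logps, ref_logps: (B, K) summed log-probs of each response.
    rewards:                 (B, K) explicit scalar rewards."""
    implicit = beta * (policy_logps - ref_logps)   # implicit reward of the policy
    targets = F.softmax(rewards / alpha, dim=-1)   # soft labels from explicit rewards
    # Cross-entropy between reward-derived targets and the model's softmax
    # over implicit rewards -- the noise-contrastive coupling in the blurb.
    return -(targets * F.log_softmax(implicit, dim=-1)).sum(-1).mean()

# Toy call: 2 prompts, 4 candidate responses each.
print(float(info_nca_loss(torch.randn(2, 4), torch.randn(2, 4), torch.randn(2, 4))))
```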
Noether's Razor: Learning Conserved Quantities
·2052 words·10 mins
AI Generated Machine Learning Deep Learning 🏢 Imperial College London
Noether’s Razor learns conserved quantities and symmetries directly from data via Bayesian model selection, improving dynamical systems modeling accuracy and generalizability.
No-Regret M${}^{\natural}$-Concave Function Maximization: Stochastic Bandit Algorithms and NP-Hardness of Adversarial Full-Information Setting
·1615 words·8 mins
AI Generated AI Theory Optimization 🏢 Hokkaido University
This paper presents efficient stochastic bandit algorithms for maximizing M♮-concave functions and proves NP-hardness of the adversarial full-information setting.
No-Regret Learning for Fair Multi-Agent Social Welfare Optimization
·277 words·2 mins
AI Theory Fairness 🏢 University of Iowa
This paper solves the open problem of achieving no-regret learning in online multi-agent Nash social welfare maximization.
No-Regret Bandit Exploration based on Soft Tree Ensemble Model
·1480 words·7 mins
AI Generated Machine Learning Reinforcement Learning 🏢 LY Corporation
A novel stochastic bandit algorithm using soft tree ensemble models achieves lower cumulative regret than existing ReLU-based neural bandit algorithms, offering a constrained yet effective hypothesis …
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations
·3800 words·18 mins
Computer Vision Image Classification 🏢 QUVA Lab, University of Amsterdam
Self-supervised gradients boost frozen deep learning model performance!
No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
·5380 words·26 mins
Machine Learning Reinforcement Learning 🏢 CLAIRE, EPFL
Deep RL agents trained under non-stationarity suffer performance collapse due to representation degradation; this work reveals this in PPO and introduces Proximal Feature Optimization (PFO) to mitigat…
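A hedged PyTorch sketch of the proximal-feature idea as the summary states it: an auxiliary penalty on representation drift added to the PPO loss. The squared-error form, the choice of penultimate features, and `coef` are illustrative assumptions, not the paper's exact objective:

```python
import torch

def pfo_penalty(feats_new, feats_old, coef=0.01):
    # Keep the policy network's penultimate features close to their
    # pre-update values, mirroring how PPO keeps the policy ratio proximal.
    return coef * (feats_new - feats_old.detach()).pow(2).mean()

# Sketch of use in a PPO step: total_loss = ppo_loss + pfo_penalty(phi(s), phi_old(s))
feats_old = torch.randn(32, 64)                    # from the frozen pre-update net
feats_new = feats_old + 0.1 * torch.randn(32, 64)  # from the current net
print(float(pfo_penalty(feats_new, feats_old)))
```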
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
·4811 words·23 mins
AI Generated Machine Learning Reinforcement Learning 🏢 University of Oxford
AI agents learn better with well-designed training environments. This paper reveals flaws in current environment-selection methods and introduces Sampling for Learnability (SFL), a new approach that …
No Free Lunch Theorem and Black-Box Complexity Analysis for Adversarial Optimisation
·532 words·3 mins
AI Generated AI Theory Optimization 🏢 University of Birmingham
No free lunch for adversarial optimization: This paper proves that no single algorithm universally outperforms others when finding Nash Equilibrium, introducing black-box complexity analysis to estab…
No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices
·3353 words·16 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
LLM watermarking faces inherent trade-offs; this paper reveals simple attacks exploiting common design choices, proposing guidelines and defenses for more secure systems.
No Free Delivery Service: Epistemic limits of passive data collection in complex social systems
·2178 words·11 mins
AI Theory Generalization 🏢 Meta AI
Passive data collection in complex social systems invalidates standard AI model validation; new methods are needed.
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models
·2229 words·11 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 Google DeepMind
Contrastive vision-language models (VLMs) trained only on English data significantly underperform on culturally diverse benchmarks. This paper reveals this bias, proposes novel evaluation metrics, and…
No 'Zero-Shot' Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
·6344 words·30 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 University of Oxford
Multimodal models’ impressive ‘zero-shot’ performance hinges on the frequency of concepts in their training data, not inherent generalization ability; exponentially more data is needed for linear impr…
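The blurb's scaling claim can be written as a schematic log-linear law (a summary-level form, not the paper's fitted equation): if a concept appears with frequency $f$ in the pretraining data, then

```latex
\mathrm{Acc}(f) \approx a + b \log f
\quad\Longrightarrow\quad
\mathrm{Acc}(kf) - \mathrm{Acc}(f) \approx b \log k ,
```

so each constant gain in zero-shot accuracy requires multiplying the concept's data frequency by a constant factor, i.e. exponentially more data overall.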
Nimbus: Secure and Efficient Two-Party Inference for Transformers
·3036 words·15 mins
AI Generated AI Theory Privacy 🏢 Shanghai Jiao Tong University
Nimbus achieves 2.7-4.7x speedup in BERT base inference using novel two-party computation techniques for efficient matrix multiplication and non-linear layer approximation.