
Posters

2024

Non-asymptotic Convergence of Training Transformers for Next-token Prediction
·361 words·2 mins
AI Generated AI Theory Optimization 🏢 Penn State University
This paper reveals how a one-layer transformer’s training converges for next-token prediction, showing sub-linear convergence for both layers and shedding light on its surprising generalization abilit…
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
·2753 words·13 mins
AI Generated Machine Learning Optimization 🏢 Sorbonne Université
This paper rigorously analyzes biased adaptive stochastic gradient descent (SGD), proving convergence to critical points for non-convex functions even with biased gradient estimations. The analysis c…
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
·2513 words·12 mins
Natural Language Processing Large Language Models 🏢 Rice University
NoMAD-Attention achieves up to 2x speedup in 4-bit quantized LLaMA inference on CPUs by replacing computationally expensive multiply-add operations with ultra-low-latency in-register lookups.
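A minimal NumPy sketch of the lookup idea the summary describes, assuming a product-quantization-style key codebook; the shapes, the random codebook, and `lookup_scores` are illustrative stand-ins, not NoMAD-Attention's implementation:

```python
import numpy as np

d, n_keys, n_sub, n_cent = 64, 128, 8, 16   # dims, #keys, sub-quantizers, centroids
sub_d = d // n_sub
rng = np.random.default_rng(0)

centroids = rng.normal(size=(n_sub, n_cent, sub_d))  # trained offline in practice
keys = rng.normal(size=(n_keys, d))

# Offline: encode each key sub-vector as the index of its nearest centroid.
codes = np.stack(
    [np.argmin(((keys[:, s * sub_d:(s + 1) * sub_d][:, None]
                 - centroids[s]) ** 2).sum(-1), axis=1)
     for s in range(n_sub)], axis=1)                  # (n_keys, n_sub)

def lookup_scores(q):
    # Per query: one small table of <query sub-vector, centroid> dot products...
    tables = np.einsum('sd,scd->sc', q.reshape(n_sub, sub_d), centroids)
    # ...then scoring every key is pure lookup-and-add, no per-key multiplies
    # (the real method keeps these tables in SIMD registers).
    return tables[np.arange(n_sub), codes].sum(axis=1)  # (n_keys,)

q = rng.normal(size=d)
print(np.corrcoef(lookup_scores(q), keys @ q)[0, 1])  # tracks the exact scores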
Noisy Dual Mirror Descent: A Near Optimal Algorithm for Jointly-DP Convex Resource Allocation
·2148 words·11 mins
AI Generated AI Theory Privacy 🏢 Nanyang Business School, Nanyang Technological University
This paper introduces a near-optimal algorithm for jointly differentially private convex resource allocation, achieving improved accuracy and privacy guarantees.
NoiseGPT: Label Noise Detection and Rectification through Probability Curvature
·2389 words·12 mins
Natural Language Processing Large Language Models 🏢 Beijing Institute of Technology
NoiseGPT uses multi-modal LLMs to detect & fix noisy image labels by identifying probability curvature differences between clean and noisy examples.
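A toy illustration of the idea that probability curvature differs between clean and noisy points; this second-difference probe and the bump-shaped score function are assumptions for illustration, not NoiseGPT's detector, which queries a multimodal LLM:

```python
import numpy as np

def curvature(prob_fn, x, direction, eps=0.05):
    # Central second difference of p(label | x) along a probe direction.
    p_minus, p_center, p_plus = (prob_fn(x - eps * direction),
                                 prob_fn(x),
                                 prob_fn(x + eps * direction))
    return (p_plus - 2 * p_center + p_minus) / eps ** 2

# Toy stand-in for the model's label probability: a smooth bump at a prototype.
proto = np.zeros(8)
prob = lambda x: float(np.exp(-np.sum((x - proto) ** 2)))

direction = np.ones(8) / np.sqrt(8)
clean_example = proto + 0.05   # correctly labeled: near the mode, strong curvature
noisy_example = proto + 2.0    # mislabeled: flat region, near-zero curvature
print("clean:", curvature(prob, clean_example, direction))
print("noisy:", curvature(prob, noisy_example, direction))
```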
Noise-Aware Differentially Private Regression via Meta-Learning
·3336 words·16 mins
AI Generated AI Theory Privacy 🏢 University of Helsinki
Meta-learning and differential privacy combine to enable accurate, well-calibrated private regression, even with limited data, via the novel DPConvCNP model.
Noise Contrastive Alignment of Language Models with Explicit Rewards
·2166 words·11 mins
Natural Language Processing Large Language Models 🏢 Tsinghua University
This paper introduces InfoNCA and NCA, novel frameworks for language model alignment using noise contrastive estimation, enabling direct optimization from both explicit rewards and pairwise preference…
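A hedged PyTorch sketch of an InfoNCA-style objective as the summary describes it: soft targets from explicit rewards, contrasted against the model's implicit rewards over K candidate responses. The temperatures `alpha`, `beta` and the log-ratio parameterization are assumptions, not the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def info_nca_loss(policy_logps, ref_logps, rewards, alpha=1.0, beta=0.1):
    """Contrastive alignment over K candidate responses per prompt.
    policy_logps, ref_logps: (B, K) summed log-probs of each response.
    rewards:                 (B, K) explicit scalar rewards."""
    implicit = beta * (policy_logps - ref_logps)   # implicit reward of the policy
    targets = F.softmax(rewards / alpha, dim=-1)   # soft labels from explicit rewards
    # Cross-entropy between reward-derived targets and the model's softmax
    # over implicit rewards -- the noise-contrastive coupling in the blurb.
    return -(targets * F.log_softmax(implicit, dim=-1)).sum(-1).mean()

# Toy call: 2 prompts, 4 candidate responses each.
print(float(info_nca_loss(torch.randn(2, 4), torch.randn(2, 4), torch.randn(2, 4))))
```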
Noether's Razor: Learning Conserved Quantities
·2052 words·10 mins
AI Generated Machine Learning Deep Learning 🏢 Imperial College London
Noether’s Razor learns conserved quantities and symmetries directly from data via Bayesian model selection, improving dynamical systems modeling accuracy and generalizability.
No-Regret M${}^{\natural}$-Concave Function Maximization: Stochastic Bandit Algorithms and NP-Hardness of Adversarial Full-Information Setting
·1615 words·8 mins
AI Generated AI Theory Optimization 🏢 Hokkaido University
This paper presents efficient stochastic bandit algorithms for maximizing M♮-concave functions and proves NP-hardness of the adversarial full-information setting.
No-Regret Learning for Fair Multi-Agent Social Welfare Optimization
·277 words·2 mins
AI Theory Fairness 🏢 University of Iowa
This paper solves the open problem of achieving no-regret learning in online multi-agent Nash social welfare maximization.
No-Regret Bandit Exploration based on Soft Tree Ensemble Model
·1480 words·7 mins
AI Generated Machine Learning Reinforcement Learning 🏢 LY Corporation
A novel stochastic bandit algorithm using soft tree ensemble models achieves lower cumulative regret than existing ReLU-based neural bandit algorithms, offering a constrained yet effective hypothesis …
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations
·3800 words·18 mins
Computer Vision Image Classification 🏢 QUVA Lab, University of Amsterdam
Self-supervised gradients boost frozen deep learning model performance!
No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
·5380 words·26 mins
Machine Learning Reinforcement Learning 🏢 CLAIRE, EPFL
Deep RL agents trained under non-stationarity suffer performance collapse due to representation degradation; this work reveals this in PPO and introduces Proximal Feature Optimization (PFO) to mitigat…
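A hedged PyTorch sketch of the proximal-feature idea as the summary states it: an auxiliary penalty on representation drift added to the PPO loss. The squared-error form, the choice of penultimate features, and `coef` are illustrative assumptions, not the paper's exact objective:

```python
import torch

def pfo_penalty(feats_new, feats_old, coef=0.01):
    # Keep the policy network's penultimate features close to their
    # pre-update values, mirroring how PPO keeps the policy ratio proximal.
    return coef * (feats_new - feats_old.detach()).pow(2).mean()

# Sketch of use in a PPO step: total_loss = ppo_loss + pfo_penalty(phi(s), phi_old(s))
feats_old = torch.randn(32, 64)                    # from the frozen pre-update net
feats_new = feats_old + 0.1 * torch.randn(32, 64)  # from the current net
print(float(pfo_penalty(feats_new, feats_old)))
```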
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
·4811 words·23 mins
AI Generated Machine Learning Reinforcement Learning 🏢 University of Oxford
AI agents learn better with well-designed training environments. This paper reveals flaws in current environment-selection methods and introduces Sampling for Learnability (SFL), a new approach that …
No Free Lunch Theorem and Black-Box Complexity Analysis for Adversarial Optimisation
·532 words·3 mins
AI Generated AI Theory Optimization 🏢 University of Birmingham
No free lunch for adversarial optimization: This paper proves that no single algorithm universally outperforms others when finding Nash Equilibrium, introducing black-box complexity analysis to estab…
No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices
·3353 words·16 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
LLM watermarking faces inherent trade-offs; this paper reveals simple attacks exploiting common design choices, proposing guidelines and defenses for more secure systems.
No Free Delivery Service: Epistemic limits of passive data collection in complex social systems
·2178 words·11 mins
AI Theory Generalization 🏢 Meta AI
Passive data collection in complex social systems invalidates standard AI model validation; new methods are needed.
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models
·2229 words·11 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 Google DeepMind
Contrastive vision-language models (VLMs) trained only on English data significantly underperform on culturally diverse benchmarks. This paper reveals this bias, proposes novel evaluation metrics, and…
No 'Zero-Shot' Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
·6344 words·30 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 University of Oxford
Multimodal models’ impressive ‘zero-shot’ performance hinges on the frequency of concepts in their training data, not inherent generalization ability; exponentially more data is needed for linear impr…
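The blurb's scaling claim can be written as a schematic log-linear law (a summary-level form, not the paper's fitted equation): if a concept appears with frequency $f$ in the pretraining data, then

```latex
\mathrm{Acc}(f) \approx a + b \log f
\quad\Longrightarrow\quad
\mathrm{Acc}(kf) - \mathrm{Acc}(f) \approx b \log k ,
```

so each constant gain in zero-shot accuracy requires multiplying the concept's data frequency by a constant factor, i.e. exponentially more data overall.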
Nimbus: Secure and Efficient Two-Party Inference for Transformers
·3036 words·15 mins
AI Generated AI Theory Privacy 🏢 Shanghai Jiao Tong University
Nimbus achieves 2.7-4.7x speedup in BERT base inference using novel two-party computation techniques for efficient matrix multiplication and non-linear layer approximation.