Posters
2024
Non-asymptotic Convergence of Training Transformers for Next-token Prediction
·361 words·2 mins
AI Generated
AI Theory
Optimization
🏢 Penn State University
This paper reveals how a one-layer transformer’s training converges for next-token prediction, showing sub-linear convergence for both layers and shedding light on its surprising generalization ability.
Non-asymptotic Analysis of Biased Adaptive Stochastic Approximation
·2753 words·13 mins
AI Generated
Machine Learning
Optimization
🏢 Sorbonne Université
This paper rigorously analyzes biased adaptive stochastic gradient descent (SGD), proving convergence to critical points for non-convex functions even with biased gradient estimates. The analysis covers adaptive algorithms such as Adagrad and RMSProp.
NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention
·2513 words·12 mins
Natural Language Processing
Large Language Models
🏢 Rice University
NoMAD-Attention achieves up to 2x speedup in 4-bit quantized LLaMA inference on CPUs by replacing computationally expensive multiply-add operations with ultra-low-latency in-register lookups.
Noisy Dual Mirror Descent: A Near Optimal Algorithm for Jointly-DP Convex Resource Allocation
·2148 words·11 mins
AI Generated
AI Theory
Privacy
🏢 Nanyang Business School, Nanyang Technological University
A near-optimal algorithm for jointly differentially private convex resource allocation is introduced, achieving improved accuracy and privacy guarantees.
NoiseGPT: Label Noise Detection and Rectification through Probability Curvature
·2389 words·12 mins
Natural Language Processing
Large Language Models
🏢 Beijing Institute of Technology
NoiseGPT uses multi-modal LLMs to detect & fix noisy image labels by identifying probability curvature differences between clean and noisy examples.
Noise-Aware Differentially Private Regression via Meta-Learning
·3336 words·16 mins
AI Generated
AI Theory
Privacy
🏢 University of Helsinki
Meta-learning and differential privacy combine to enable accurate, well-calibrated private regression, even with limited data, via the novel DPConvCNP model.
Noise Contrastive Alignment of Language Models with Explicit Rewards
·2166 words·11 mins
Natural Language Processing
Large Language Models
🏢 Tsinghua University
This paper introduces InfoNCA and NCA, novel frameworks for language model alignment using noise contrastive estimation, enabling direct optimization from both explicit rewards and pairwise preferences.
Noether's Razor: Learning Conserved Quantities
·2052 words·10 mins
AI Generated
Machine Learning
Deep Learning
🏢 Imperial College London
Noether’s Razor learns conserved quantities and symmetries directly from data via Bayesian model selection, improving dynamical systems modeling accuracy and generalizability.
No-Regret M${}^{\natural}$-Concave Function Maximization: Stochastic Bandit Algorithms and NP-Hardness of Adversarial Full-Information Setting
·1615 words·8 mins
AI Generated
AI Theory
Optimization
🏢 Hokkaido University
This paper presents efficient stochastic bandit algorithms for maximizing M${}^{\natural}$-concave functions and proves NP-hardness of the adversarial full-information setting.
No-Regret Learning for Fair Multi-Agent Social Welfare Optimization
·277 words·2 mins
AI Theory
Fairness
🏢 University of Iowa
This paper solves the open problem of achieving no-regret learning in online multi-agent Nash social welfare maximization.
No-Regret Bandit Exploration based on Soft Tree Ensemble Model
·1480 words·7 mins
AI Generated
Machine Learning
Reinforcement Learning
🏢 LY Corporation
A novel stochastic bandit algorithm using soft tree ensemble models achieves lower cumulative regret than existing ReLU-based neural bandit algorithms, offering a constrained yet effective hypothesis space.
No Train, all Gain: Self-Supervised Gradients Improve Deep Frozen Representations
·3800 words·18 mins
Computer Vision
Image Classification
🏢 QUVA Lab, University of Amsterdam
Self-supervised gradients boost frozen deep learning model performance!
No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO
·5380 words·26 mins
Machine Learning
Reinforcement Learning
🏢 CLAIRE, EPFL
Deep RL agents trained under non-stationarity suffer performance collapse due to representation degradation; this work demonstrates the effect in PPO and introduces Proximal Feature Optimization (PFO) to mitigate it.
No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery
·4811 words·23 mins
AI Generated
Machine Learning
Reinforcement Learning
🏢 University of Oxford
AI agents learn better with well-designed training environments. This paper reveals flaws in current environment-selection methods and introduces Sampling for Learnability (SFL), a new approach that prioritizes environments with high learnability.
No Free Lunch Theorem and Black-Box Complexity Analysis for Adversarial Optimisation
·532 words·3 mins
AI Generated
AI Theory
Optimization
🏢 University of Birmingham
No free lunch for adversarial optimization: This paper proves that no single algorithm universally outperforms others when finding Nash Equilibrium, introducing black-box complexity analysis to establish lower bounds for this setting.
No Free Lunch in LLM Watermarking: Trade-offs in Watermarking Design Choices
·3353 words·16 mins
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
LLM watermarking faces inherent trade-offs; this paper reveals simple attacks exploiting common design choices, proposing guidelines and defenses for more secure systems.
No Free Delivery Service: Epistemic limits of passive data collection in complex social systems
·2178 words·11 mins
AI Theory
Generalization
🏢 Meta AI
Passive data collection in complex social systems invalidates standard AI model validation; new methods are needed.
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models
·2229 words·11 mins
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Google DeepMind
Contrastive vision-language models (VLMs) trained only on English data significantly underperform on culturally diverse benchmarks. This paper reveals this bias, proposes novel evaluation metrics, and shows that pretraining on unfiltered, globally diverse data improves cultural understanding.
No 'Zero-Shot' Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
·6344 words·30 mins
AI Generated
Multimodal Learning
Vision-Language Models
🏢 University of Oxford
Multimodal models’ impressive ‘zero-shot’ performance hinges on the frequency of concepts in their training data, not inherent generalization ability; exponentially more data is needed for linear improvements in performance.
Nimbus: Secure and Efficient Two-Party Inference for Transformers
·3036 words·15 mins
AI Generated
AI Theory
Privacy
🏢 Shanghai Jiao Tong University
Nimbus achieves 2.7-4.7x speedup in BERT base inference using novel two-party computation techniques for efficient matrix multiplication and non-linear layer approximation.