🏢 Yale University
Unveiling Induction Heads: Provable Training Dynamics and Feature Learning in Transformers
·2178 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Yale University
Transformers learn complex tasks surprisingly well through in-context learning, but the mechanism remains unclear. This paper proves that a two-layer transformer trained on n-gram Markov chain data co…
Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
·1948 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Yale University
TAP: automated jailbreaking of black-box LLMs with high success rates, using fewer queries than previous methods.
Transformation-Invariant Learning and Theoretical Guarantees for OOD Generalization
·541 words·3 mins·
AI Theory
Generalization
🏢 Yale University
This paper introduces a novel theoretical framework for robust machine learning under distribution shifts, offering learning rules and guarantees, highlighting the game-theoretic viewpoint of distribu…
Solving Inverse Problems via Diffusion Optimal Control
·2106 words·10 mins·
AI Theory
Optimization
🏢 Yale University
Revolutionizing inverse problem solving, this paper introduces diffusion optimal control, a novel framework converting signal recovery into a discrete optimal control problem, surpassing limitations o…
RSA: Resolving Scale Ambiguities in Monocular Depth Estimators through Language Descriptions
·2064 words·10 mins·
Natural Language Processing
Vision-Language Models
🏢 Yale University
RSA: Language unlocks metric depth from single images!
Provable Partially Observable Reinforcement Learning with Privileged Information
·452 words·3 mins·
Machine Learning
Reinforcement Learning
🏢 Yale University
This paper provides the first provable efficiency guarantees for practically-used RL algorithms leveraging privileged information, addressing limitations of previous empirical paradigms and opening ne…
On Tractable $\Phi$-Equilibria in Non-Concave Games
·1428 words·7 mins·
AI Theory
Optimization
🏢 Yale University
This paper presents efficient algorithms for approximating equilibria in non-concave games, focusing on tractable Φ-equilibria and addressing computational challenges posed by infinite strategy sets.
On the Role of Information Structure in Reinforcement Learning for Partially-Observable Sequential Teams and Games
·2014 words·10 mins·
Machine Learning
Reinforcement Learning
🏢 Yale University
New reinforcement learning model clarifies the role of information structure in partially-observable sequential decision-making problems, proving an upper bound on learning complexity.
On the Computational Landscape of Replicable Learning
·348 words·2 mins·
AI Theory
Optimization
🏢 Yale University
This paper reveals surprising computational connections between algorithmic replicability and other learning paradigms, offering novel algorithms and demonstrating separations between replicability an…
Nonlinear dynamics of localization in neural receptive fields
·1762 words·9 mins·
Unsupervised Learning
🏢 Yale University
Neural receptive fields’ localization emerges from nonlinear learning dynamics driven by naturalistic data’s higher-order statistics, not just sparsity.
Injecting Undetectable Backdoors in Obfuscated Neural Networks and Language Models
·372 words·2 mins·
AI Theory
Robustness
🏢 Yale University
Researchers developed a novel method to inject undetectable backdoors into obfuscated neural networks and language models, even with white-box access, posing significant security risks.
Inference of Neural Dynamics Using Switching Recurrent Neural Networks
·2472 words·12 mins·
Machine Learning
Deep Learning
🏢 Yale University
SRNNs reveal behaviorally-relevant neural dynamics switches!
From Similarity to Superiority: Channel Clustering for Time Series Forecasting
·4001 words·19 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Yale University
Channel Clustering Module (CCM) boosts time series forecasting accuracy by intelligently grouping similar channels, improving model performance and generalization.