🏢 École Polytechnique Fédérale De Lausanne

Towards a theory of how the structure of language is acquired by deep neural networks
·3238 words·16 mins
AI Generated Natural Language Processing Large Language Models 🏢 École Polytechnique Fédérale De Lausanne
Deep learning models learn language structure through next-token prediction, but the data requirements remain unclear. This paper reveals that the effective context window, determining learning capaci…
Mean-Field Langevin Dynamics for Signed Measures via a Bilevel Approach
·350 words·2 mins
AI Theory Optimization 🏢 École Polytechnique Fédérale De Lausanne
This paper presents a novel bilevel approach to extend mean-field Langevin dynamics to solve convex optimization problems over signed measures, achieving stronger guarantees and faster convergence rat…
AdaNCA: Neural Cellular Automata As Adaptors For More Robust Vision Transformer
·3672 words·18 mins
Computer Vision Image Classification 🏢 École Polytechnique Fédérale De Lausanne
To boost Vision Transformer robustness against attacks and noisy data, AdaNCA uses Neural Cellular Automata as plug-and-play adaptors between ViT layers, achieving significant accuracy improvement with …