🏢 Google DeepMind
On scalable oversight with weak LLMs judging strong LLMs
·5158 words·25 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Google DeepMind
Weak LLMs can accurately supervise strong LLMs via debate, outperforming simpler consultancy methods, especially in information-asymmetric tasks.
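The debate protocol behind this result is easy to prototype. Below is a minimal sketch, not the paper's exact setup: two copies of a strong model argue for opposing answers over a few rounds, and a weaker model judges the transcript. `strong_model` and `weak_judge` are hypothetical text-in/text-out callables.

```python
def debate(question: str, answer_a: str, answer_b: str,
           strong_model, weak_judge, rounds: int = 3) -> str:
    """Run a two-debater protocol and return the judge's chosen answer."""
    transcript = [f"Question: {question}",
                  f"Debater A defends: {answer_a}",
                  f"Debater B defends: {answer_b}"]
    for _ in range(rounds):
        for side, answer in (("A", answer_a), ("B", answer_b)):
            argument = strong_model(
                f"You are debater {side}, defending '{answer}'.\n"
                + "\n".join(transcript)
                + "\nGive your strongest next argument.")
            transcript.append(f"{side}: {argument}")
    verdict = weak_judge("\n".join(transcript)
                         + "\nWhich debater is correct? Answer 'A' or 'B'.")
    return answer_a if "A" in verdict else answer_b
```

Consultancy, the baseline it beats, drops the second debater: a single model argues for one assigned answer, so the judge sees no adversarial counter-argument.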
Normalization and effective learning rates in reinforcement learning
·2714 words·13 mins
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
Normalize-and-Project (NaP) boosts reinforcement learning by stabilizing layer normalization, preventing plasticity loss, and enabling effective learning rate control.
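The core mechanism is simple to state: because layer normalization makes the network invariant to the scale of the preceding weights, growing weight norms silently shrink the effective learning rate. A minimal sketch of the "project" step, assuming per-matrix projection to a fixed radius after each optimizer step (which parameters the paper projects, and at what granularity, is not reproduced here):

```python
import torch

@torch.no_grad()
def project_weights(model: torch.nn.Module, radius: float = 1.0) -> None:
    """After optimizer.step(), rescale each weight matrix to a fixed norm.
    With layer norm downstream this leaves the network function unchanged
    but pins the effective learning rate instead of letting it decay."""
    for p in model.parameters():
        if p.dim() >= 2:  # leave biases and norm gains untouched
            p.mul_(radius / p.norm().clamp_min(1e-12))
```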
Non-Stationary Learning of Neural Networks with Automatic Soft Parameter Reset
·4994 words·24 mins
AI Generated
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
AI models struggle with changing data; this paper introduces Soft Resets, a novel learning approach that uses an adaptive drift to gracefully guide parameters toward initialization, improving adaptabi…
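Stripped of the adaptive machinery, the mechanism is an interpolation toward the initial weights. A minimal sketch, assuming a plain scalar drift rate (in the paper the drift is set adaptively from an explicit model of non-stationarity):

```python
import torch

@torch.no_grad()
def soft_reset(params, init_params, drift: float) -> None:
    """Move each parameter part-way back toward its value at initialization:
    p <- (1 - drift) * p + drift * p0. drift = 0 is no reset at all;
    drift = 1 is a hard reset to the initial weights."""
    for p, p0 in zip(params, init_params):
        p.lerp_(p0, drift)
```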
No Filter: Cultural and Socioeconomic Diversity in Contrastive Vision-Language Models
·2229 words·11 mins
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Google DeepMind
Contrastive vision-language models (VLMs) trained only on English data significantly underperform on culturally diverse benchmarks. This paper reveals this bias, proposes novel evaluation metrics, and…
Neural Assets: 3D-Aware Multi-Object Scene Synthesis with Image Diffusion Models
·2318 words·11 mins
Image Generation
🏢 Google DeepMind
Neural Assets enables intuitive 3D multi-object scene editing via image diffusion models by using per-object representations to control individual object poses, achieving state-of-the-art results.
Neglected Hessian component explains mysteries in sharpness regularization
·1889 words·9 mins
🏢 Google DeepMind
The paper resolves deep learning's mysteries around sharpness regularization by uncovering the crucial role of a neglected Hessian component, the Nonlinear Modeling Error (NME).
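Concretely, for a training loss $L(f_\theta(x))$ the Hessian splits into a Gauss-Newton term and a second-order term that common approximations drop; the latter is the NME:

```latex
\nabla^2_\theta L
  = \underbrace{J_f^\top \big(\nabla^2_f L\big) J_f}_{\text{Gauss--Newton}}
  + \underbrace{\sum_i \frac{\partial L}{\partial f_i}\,
      \nabla^2_\theta f_i}_{\text{Nonlinear Modeling Error (NME)}}
```

Here $J_f = \partial f_\theta / \partial \theta$ is the network Jacobian; the NME vanishes exactly when the model is linear in its parameters.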
Near-Minimax-Optimal Distributional Reinforcement Learning with a Generative Model
·1906 words·9 mins
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
New distributional RL algorithm (DCFP) achieves near-minimax optimality for return distribution estimation in the generative model regime.
Multistep Distillation of Diffusion Models via Moment Matching
·2156 words·11 mins
Computer Vision
Image Generation
🏢 Google DeepMind
New method distills slow diffusion models into fast, few-step models by matching data expectations, achieving state-of-the-art results on ImageNet.
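A heavily simplified reading of the idea: sample from the few-step student, re-noise the sample, and ask the frozen teacher for its posterior mean E[x | x_t]; if the student matched the data distribution, these moments would agree. Every name below is a placeholder, and the paper's actual training alternates with an auxiliary denoiser rather than using this naive one-sample loss:

```python
def moment_matching_loss(student, teacher_denoiser, noise_schedule, batch):
    """Hypothetical sketch: penalize disagreement between a student sample
    and the teacher's conditional expectation under re-noising."""
    x_student = student.sample(batch)                # few-step generation
    t = noise_schedule.sample_t(len(x_student))      # random noise levels
    x_t = noise_schedule.add_noise(x_student, t)     # forward (re-)noising
    x_teacher = teacher_denoiser(x_t, t)             # teacher's E[x | x_t]
    return ((x_student - x_teacher) ** 2).mean()
```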
Moving Off-the-Grid: Scene-Grounded Video Representations
·2151 words·11 mins
Video Understanding
🏢 Google DeepMind
MooG, a self-supervised video model, learns off-the-grid representations that track scene elements consistently through motion, outperforming grid-based baselines on various vision tasks.
Mind the Graph When Balancing Data for Fairness or Robustness
·1841 words·9 mins
AI Theory
Fairness
🏢 Google DeepMind
Data balancing in machine learning can hurt fairness and robustness; this paper reveals when and why, offering solutions for safer AI.
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
·2343 words·11 mins
Natural Language Processing
Large Language Models
🏢 Google DeepMind
Prompting LLMs to label math problems with the skills they require, then selecting skill-matched exemplars, significantly boosts problem-solving accuracy.
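The pipeline is two prompting passes: label the problem with a skill, then build the prompt from exemplars filed under that skill. A minimal sketch; `llm` is any text-completion callable, and `skill_bank` (skill name mapped to solved exemplars) is assumed to have been built beforehand, in the paper by a strong LLM that also clusters its own skill labels:

```python
def skill_guided_prompt(problem: str, llm, skill_bank: dict, k: int = 4) -> str:
    """Label the problem with a skill, then prepend skill-matched exemplars."""
    skill = llm(
        f"Name the one math skill needed to solve this problem.\n"
        f"Problem: {problem}\nChoose from: {', '.join(skill_bank)}").strip()
    exemplars = skill_bank.get(skill, [])[:k]  # skill-matched worked examples
    return "\n\n".join(exemplars) + f"\n\nProblem: {problem}\nSolution:"
```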
Many-Shot In-Context Learning
·3209 words·16 mins
Large Language Models
🏢 Google DeepMind
Scaling up in-context learning using thousands of examples significantly boosts Large Language Model (LLM) performance, particularly for complex tasks. Novel training methods mitigate reliance on hum…
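Mechanically, many-shot ICL is just a longer prompt; the paper's contribution is showing what happens as the shot count climbs into the hundreds or thousands (and substituting model-generated rationales or unlabeled inputs for human-written outputs). A trivial sketch of prompt assembly:

```python
def many_shot_prompt(examples, query: str, shots: int = 1024) -> str:
    """Concatenate up to `shots` (input, output) pairs before the query.
    Long-context models make shot counts in the thousands feasible."""
    demos = "\n\n".join(f"Input: {x}\nOutput: {y}" for x, y in examples[:shots])
    return f"{demos}\n\nInput: {query}\nOutput:"
```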
MambaLRP: Explaining Selective State Space Sequence Models
·3148 words·15 mins
AI Theory
Interpretability
🏢 Google DeepMind
MambaLRP enhances explainability of Mamba sequence models by ensuring faithful relevance propagation, achieving state-of-the-art explanation performance, and uncovering model biases.
Long-form factuality in large language models
·4779 words·23 mins
Natural Language Processing
Large Language Models
🏢 Google DeepMind
LLMs often generate factually inaccurate long-form text. This work introduces LongFact, a new benchmark dataset of 2280 fact-seeking prompts, and SAFE, a novel automated evaluation method that outperf…
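SAFE is a pipeline over individual facts. The sketch below keeps only the shape, with three placeholder callables; the real system prompts an LLM for the splitting and rating stages and issues Google Search queries for evidence:

```python
def safe_style_score(response: str, split_into_facts, search, rate_fact):
    """Split a long-form response into atomic facts, fetch evidence for
    each, and count how many are rated as supported."""
    facts = split_into_facts(response)
    verdicts = [rate_fact(fact, search(fact)) for fact in facts]
    supported = sum(v == "supported" for v in verdicts)
    return supported, len(facts)
```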
LocCa: Visual Pretraining with Location-aware Captioners
·2114 words·10 mins
Multimodal Learning
Vision-Language Models
🏢 Google DeepMind
LocCa, a novel visual pretraining paradigm, uses location-aware captioning tasks to boost downstream localization performance while maintaining holistic task capabilities.
Learning Successor Features the Simple Way
·9069 words·43 mins
Machine Learning
Reinforcement Learning
🏢 Google DeepMind
Learn deep Successor Features (SFs) directly from pixels, efficiently and without representation collapse, via a simple method that combines a TD loss with reward prediction.
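The two losses are the whole trick: successor features ψ are regressed onto a bootstrapped target φ(s) + γψ(s'), while a linear reward head keeps the base features φ informative. A sketch, assuming a stop-gradient on the bootstrap target (the exact stop-gradient placement and network architectures are simplifications):

```python
import torch
import torch.nn.functional as F

def sf_losses(phi, psi, w, s, s_next, r, gamma: float = 0.99):
    """TD loss on successor features plus a reward-prediction loss."""
    features = phi(s)                                   # base features phi(s)
    target = (features + gamma * psi(s_next)).detach()  # bootstrap, no grad
    td_loss = F.mse_loss(psi(s), target)                # psi ~ phi + gamma psi'
    reward_loss = F.mse_loss(features @ w, r)           # r ~ <phi(s), w>
    return td_loss + reward_loss
```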
Learning rigid-body simulators over implicit shapes for large-scale scenes and vision
·2932 words·14 mins
AI Applications
Robotics
🏢 Google DeepMind
SDF-Sim: A novel learned rigid-body simulator that leverages SDFs to achieve unprecedented scalability, enabling simulations with hundreds of objects and millions of nodes.
Improving Sparse Decomposition of Language Model Activations with Gated Sparse Autoencoders
·4021 words·19 mins
Natural Language Processing
Large Language Models
🏢 Google DeepMind
Gated Sparse Autoencoders (GSAEs) achieve a Pareto improvement over baseline SAEs for unsupervised feature discovery in language models, resolving the shrinkage bias of the L1 penalty by separating feature …
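The gated encoder separates which features fire from how strongly they fire, which is what removes the shrinkage pressure of a sparsity penalty on magnitudes. A sketch of the forward pass, with the paper's weight sharing between gate and magnitude paths via a per-feature rescale `r_mag` (decoder bias subtraction and training details omitted):

```python
import torch

def gated_encode(x, W_enc, b_gate, b_mag, r_mag):
    """Gated SAE encoder: a binary gate detects features; a separate
    magnitude path (shared weights, rescaled) estimates their values."""
    pre = x @ W_enc                                    # shared encoder weights
    gate = (pre + b_gate > 0).float()                  # which features fire
    mag = torch.relu(pre * torch.exp(r_mag) + b_mag)   # how strongly
    return gate * mag
```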
Imitating Language via Scalable Inverse Reinforcement Learning
·3278 words·16 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Google DeepMind
This study presents a novel Inverse Reinforcement Learning (IRL) approach for fine-tuning large language models, offering improved performance and generation diversity compared to standard methods.
Generative Hierarchical Materials Search
·1856 words·9 mins
Natural Language Processing
Large Language Models
🏢 Google DeepMind
Generative Hierarchical Materials Search (GenMS) uses AI to design novel crystal structures from natural language descriptions, outperforming prior methods in both fulfilling user requests and finding…