🏢 Google DeepMind

Fractal Patterns May Illuminate the Success of Next-Token Prediction

26 September 2024·2223 words·11 mins

Natural Language Processing Large Language Models 🏢 Google DeepMind

LLMs’ success may be explained by the self-similar, long-range dependent fractal structure of language, in which small-scale patterns reflect larger ones.

Foundations of Multivariate Distributional Reinforcement Learning

26 September 2024·1558 words·8 mins

Machine Learning Reinforcement Learning 🏢 Google DeepMind

First oracle-free, computationally tractable algorithms for provably convergent multivariate distributional RL are introduced, achieving convergence rates matching scalar settings and offering insight…

FlexCap: Describe Anything in Images in Controllable Detail

26 September 2024·2861 words·14 mins

Multimodal Learning Vision-Language Models 🏢 Google DeepMind

FlexCap generates controllable, region-specific image descriptions of varying lengths, achieving state-of-the-art zero-shot visual question answering.

FineStyle: Fine-grained Controllable Style Personalization for Text-to-image Models

26 September 2024·2833 words·14 mins

Multimodal Learning Vision-Language Models 🏢 Google DeepMind

FineStyle enables fine-grained controllable style personalization for text-to-image models using a novel concept-oriented data scaling and parameter-efficient adapter tuning, mitigating content leakag…

Fast Tree-Field Integrators: From Low Displacement Rank to Topological Transformers

26 September 2024·3010 words·15 mins

AI Generated AI Theory Optimization 🏢 Google DeepMind

Fast Tree-Field Integrators (FTFIs) revolutionize graph processing by enabling polylog-linear time computation for integrating tensor fields on trees, providing significant speedups for various machin…

EM Distillation for One-step Diffusion Models

26 September 2024·3404 words·16 mins

Computer Vision Image Generation 🏢 Google DeepMind

EM Distillation (EMD) efficiently trains one-step diffusion models by using an Expectation-Maximization approach, achieving state-of-the-art image generation quality and outperforming existing methods…

Efficient Sketches for Training Data Attribution and Studying the Loss Landscape

26 September 2024·3015 words·15 mins

AI Generated Natural Language Processing Large Language Models 🏢 Google DeepMind

Novel sketching algorithms enable scalable gradient and Hessian analysis for large language models, revealing insights into their intrinsic dimensionality and challenging existing assumptions.

Decoupling Semantic Similarity from Spatial Alignment for Neural Networks

26 September 2024·2318 words·11 mins

Computer Vision Representation Learning 🏢 Google DeepMind

Researchers developed semantic RSMs, a novel approach to measure semantic similarity in neural networks, improving image retrieval and aligning network representations with predicted class probabiliti…

Chain-of-Thought Reasoning Without Prompting

26 September 2024·2324 words·11 mins

Natural Language Processing Large Language Models 🏢 Google DeepMind

LLMs can reason effectively without prompting: simply adjusting the decoding process reveals inherent chain-of-thought paths.

CAT3D: Create Anything in 3D with Multi-View Diffusion Models

26 September 2024·1770 words·9 mins

3D Vision 🏢 Google DeepMind

CAT3D generates high-quality 3D scenes from as few as one image using a novel multi-view diffusion model, outperforming existing methods in speed and quality.

Amortized Planning with Large-Scale Transformers: A Case Study on Chess

26 September 2024·3346 words·16 mins

Machine Learning Reinforcement Learning 🏢 Google DeepMind

Large-scale transformers achieve grandmaster-level chess play via supervised learning on a new 10M-game benchmark dataset, demonstrating impressive generalization beyond memorization.