Google DeepMind
What type of inference is planning?
·1424 words·7 mins
AI Theory
Optimization
Google DeepMind
Planning is redefined as a distinct inference type within a variational framework, enabling efficient approximate planning in complex environments.
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach
·2153 words·11 mins
Computer Vision
Visual Question Answering
Google DeepMind
LLM-powered data curation boosts web-scale visual entity recognition!
UQE: A Query Engine for Unstructured Databases
·1692 words·8 mins
Natural Language Processing
Large Language Models
Google DeepMind
UQE: A novel query engine uses LLMs for efficient and accurate unstructured data analytics, surpassing existing methods.
Understanding Visual Feature Reliance through the Lens of Complexity
·3993 words·19 mins
AI Generated
Computer Vision
Image Classification
Google DeepMind
Deep learning models favor simple features, hindering generalization; this paper introduces a new feature complexity metric revealing a spectrum of simple-to-complex features, their learning dynamics,…
Understanding Transformers via N-Gram Statistics
·3310 words·16 mins
Natural Language Processing
Large Language Models
Google DeepMind
LLMs’ inner workings remain elusive. This study uses N-gram statistics to approximate transformer predictions, revealing how LLMs learn from simple to complex statistical rules, and how model variance…
Towards Estimating Bounds on the Effect of Policies under Unobserved Confounding
·1610 words·8 mins
AI Theory
Causality
Google DeepMind
This paper presents a novel framework for estimating bounds on policy effects under unobserved confounding, offering tighter bounds and robust estimators for higher-dimensional data.
To Believe or Not to Believe Your LLM: Iterative Prompting for Estimating Epistemic Uncertainty
·1940 words·10 mins
Natural Language Processing
Large Language Models
Google DeepMind
This paper introduces an innovative iterative prompting method for estimating epistemic uncertainty in LLMs, enabling reliable detection of hallucinations.
Time-Reversal Provides Unsupervised Feedback to LLMs
·2584 words·13 mins
Large Language Models
Google DeepMind
Time-reversed language models provide unsupervised feedback for improving LLMs, offering a cost-effective alternative to human feedback and enhancing LLM safety.
The Group Robustness is in the Details: Revisiting Finetuning under Spurious Correlations
·2643 words·13 mins
AI Theory
Fairness
Google DeepMind
Finetuning’s impact on worst-group accuracy is surprisingly nuanced, with common class-balancing methods sometimes hurting performance; a novel mixture method consistently outperforms others.
Stratified Prediction-Powered Inference for Effective Hybrid Evaluation of Language Models
·1611 words·8 mins
AI Generated
Natural Language Processing
Large Language Models
Google DeepMind
Stratified Prediction-Powered Inference (StratPPI) significantly improves language model evaluation by combining human and automated ratings, using stratified sampling for enhanced accuracy and tighte…
Stepping on the Edge: Curvature Aware Learning Rate Tuners
·2482 words·12 mins
Machine Learning
Deep Learning
Google DeepMind
Adaptive learning rate tuners often underperform; Curvature Dynamics Aware Tuning (CDAT) prioritizes long-term curvature stabilization, outperforming tuned constant learning rates.
Small steps no more: Global convergence of stochastic gradient bandits for arbitrary learning rates
·447 words·3 mins
AI Generated
Machine Learning
Reinforcement Learning
Google DeepMind
Stochastic gradient bandit algorithms are now guaranteed to converge globally with any constant learning rate!
Simplified and Generalized Masked Diffusion for Discrete Data
·2082 words·10 mins
Natural Language Processing
Text Generation
Google DeepMind
Simplified and generalized masked diffusion models achieve state-of-the-art results in discrete data generation, surpassing previous methods in text and image modeling.
ShiftAddLLM: Accelerating Pretrained LLMs via Post-Training Multiplication-Less Reparameterization
·3020 words·15 mins
Natural Language Processing
Large Language Models
Google DeepMind
ShiftAddLLM accelerates pretrained LLMs via post-training, multiplication-less reparameterization, achieving significant memory and energy reductions with comparable or better accuracy than existing m…
SELF-DISCOVER: Large Language Models Self-Compose Reasoning Structures
·2441 words·12 mins
Natural Language Processing
Large Language Models
Google DeepMind
LLMs self-discover optimal reasoning structures for complex problems, boosting performance by up to 32% compared to existing methods.
Schrödinger Bridge Flow for Unpaired Data Translation
·3752 words·18 mins
Transfer Learning
Google DeepMind
Accelerate unpaired data translation with Schrödinger Bridge Flow, a novel algorithm solving optimal transport problems efficiently without repeatedly training models!
Scaling Sign Language Translation
·4741 words·23 mins
AI Generated
Natural Language Processing
Machine Translation
Google DeepMind
Researchers dramatically improved sign language translation by scaling up data, model size, and the number of languages, achieving state-of-the-art results.
Recurrent Complex-Weighted Autoencoders for Unsupervised Object Discovery
·2697 words·13 mins
Computer Vision
Image Segmentation
Google DeepMind
SynCx, a novel recurrent autoencoder with complex weights, surpasses state-of-the-art models in unsupervised object discovery by iteratively refining phase relationships to achieve robust object bindi…
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models
·1757 words·9 mins
AI Theory
Privacy
Google DeepMind
Researchers reveal ‘privacy backdoors,’ a new attack that poisons pre-trained models to leak user training data, exposing critical vulnerabilities and motivating stricter model security measures.
Optimal Scalarizations for Sublinear Hypervolume Regret
·1664 words·8 mins
AI Theory
Optimization
Google DeepMind
Optimal multi-objective optimization achieved via hypervolume scalarization, offering sublinear regret bounds and outperforming existing methods.