
🏢 Google Research

Who's asking? User personas and the mechanics of latent misalignment
·3650 words·18 mins
Large Language Models 🏢 Google Research
User personas significantly affect the safety behavior of large language models: invoking a persona can bypass safety filters more effectively than direct prompting methods.
UniAR: A Unified model for predicting human Attention and Responses on visual content
·2440 words·12 mins
AI Generated Multimodal Learning Vision-Language Models 🏢 Google Research
UniAR: A unified model predicts human attention and preferences across diverse visual content (images, webpages, designs), achieving state-of-the-art performance and enabling human-centric improvement…
Understanding Transformer Reasoning Capabilities via Graph Algorithms
·2280 words·11 mins
Natural Language Processing Question Answering 🏢 Google Research
Transformers excel at graph reasoning, with logarithmic depth proving necessary and sufficient for parallelizable tasks; single-layer transformers solve retrieval tasks efficiently.
Tight Bounds for Learning RUMs from Small Slates
·255 words·2 mins
AI Generated AI Theory Optimization 🏢 Google Research
Learning user preferences accurately from limited data is key; this paper shows that surprisingly small datasets suffice for precise prediction, and provides efficient algorithms to achieve this.
The Power of Resets in Online Reinforcement Learning
·233 words·2 mins
Reinforcement Learning 🏢 Google Research
Leveraging local simulator resets in online reinforcement learning dramatically improves sample efficiency, especially for high-dimensional problems with general function approximation.
The Impact of Geometric Complexity on Neural Collapse in Transfer Learning
·1870 words·9 mins
Machine Learning Transfer Learning 🏢 Google Research
Lowering a neural network’s geometric complexity during pre-training enhances neural collapse and improves transfer learning, especially in few-shot scenarios.
Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning
·3674 words·18 mins
AI Generated Computer Vision Image Classification 🏢 Google Research
Structured Unrestricted-Rank Matrices (SURMs) revolutionize parameter-efficient fine-tuning by offering greater flexibility and accuracy than existing methods like LoRA, achieving significant gains in…
SLED: Self Logits Evolution Decoding for Improving Factuality in Large Language Models
·2353 words·12 mins
Natural Language Processing Large Language Models 🏢 Google Research
Self Logits Evolution Decoding (SLED) boosts LLM factuality by up to 20% without extra data or fine-tuning!
SequentialAttention++ for Block Sparsification: Differentiable Pruning Meets Combinatorial Optimization
·1692 words·8 mins
Machine Learning Deep Learning 🏢 Google Research
SequentialAttention++ unites differentiable pruning with combinatorial optimization for efficient and accurate neural network block sparsification, achieving state-of-the-art results.
Semantic Routing via Autoregressive Modeling
·2894 words·14 mins
Natural Language Processing AI Applications 🏢 Google Research
Learning-based semantic routing, a scalable approach to route planning using rich user queries, is introduced, accompanied by a large-scale public benchmark and a proof-of-concept model demonstrating …
Scalable DP-SGD: Shuffling vs. Poisson Subsampling
·2155 words·11 mins
AI Generated AI Theory Privacy 🏢 Google Research
This paper reveals significant privacy gaps in shuffling-based DP-SGD, proposes a scalable Poisson subsampling method, and demonstrates its superior utility for private model training.
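The gap between the two sampling schemes is easy to state in code. Below is a minimal sketch (my illustration, not the paper's implementation) contrasting shuffling-based batching, where every batch has a fixed size, with Poisson subsampling, where each example joins each batch independently, which is what standard DP-SGD privacy accounting assumes:

```python
import numpy as np

def shuffled_batches(n, batch_size, rng):
    """Shuffling: permute the dataset once, then emit fixed-size batches."""
    perm = rng.permutation(n)
    return [perm[i:i + batch_size] for i in range(0, n, batch_size)]

def poisson_batches(n, sampling_rate, num_steps, rng):
    """Poisson subsampling: each example is included in each batch
    independently with probability `sampling_rate`; batch sizes vary."""
    return [np.flatnonzero(rng.random(n) < sampling_rate)
            for _ in range(num_steps)]

rng = np.random.default_rng(0)
print([len(b) for b in shuffled_batches(1000, 100, rng)][:5])    # always 100
print([len(b) for b in poisson_batches(1000, 0.1, 10, rng)][:5])  # ~100, random
```

The variable batch sizes are what make Poisson subsampling awkward at scale, which is the systems problem the paper's method addresses.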
Randomized Truthful Auctions with Learning Agents
·324 words·2 mins
AI Generated AI Theory Optimization 🏢 Google Research
Randomized truthful auctions outperform deterministic ones when bidders employ learning algorithms, maximizing revenue in repeated interactions.
PRODuctive bandits: Importance Weighting No More
·229 words·2 mins
AI Generated AI Theory Optimization 🏢 Google Research
Prod-family algorithms achieve optimal regret in adversarial multi-armed bandits, disproving prior suboptimality conjectures.
Position Coupling: Improving Length Generalization of Arithmetic Transformers Using Task Structure
·15685 words·74 mins
AI Generated AI Theory Generalization 🏢 Google Research
Position coupling, a novel method, enhances the length generalization ability of arithmetic Transformers by directly embedding task structures into positional encodings. This simple technique enables…
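As a toy illustration of the coupling idea (a hypothetical sketch, not the paper's code): in multi-digit addition, digits of the same significance across the operands and the result can be assigned the same position ID, so the alignment structure of the task is visible to the model directly rather than left for it to infer:

```python
# Sketch: coupled position IDs for an addition problem like "357+62=419".
# Right-aligning each number gives the ones digits of a, b, and the result
# the same ID, the tens digits the same ID, and so on.
def coupled_positions(a: str, b: str, result: str):
    width = max(len(a), len(b), len(result))
    ids = {}
    for name, s in (("a", a), ("b", b), ("result", result)):
        ids[name] = list(range(width - len(s) + 1, width + 1))
    return ids

print(coupled_positions("357", "62", "419"))
# {'a': [1, 2, 3], 'b': [2, 3], 'result': [1, 2, 3]}
```

Because position IDs encode digit significance rather than raw token index, the same scheme extends to operand lengths never seen in training, which is the source of the length-generalization gains.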
Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling
·1896 words·9 mins
Natural Language Processing Large Language Models 🏢 Google Research
Orchid: a novel deep learning architecture using data-dependent convolution achieves quasilinear scalability and outperforms attention-based models on various sequence modeling tasks.
Optical Diffusion Models for Image Generation
·1966 words·10 mins
Computer Vision Image Generation 🏢 Google Research
Researchers created an energy-efficient optical system for generating images using light propagation, drastically reducing the latency and energy consumption of diffusion models.
On the Inductive Bias of Stacking Towards Improving Reasoning
·2018 words·10 mins
Natural Language Processing Large Language Models 🏢 Google Research
MIDAS: A novel training method improves language model reasoning by efficiently stacking middle layers, surprisingly boosting downstream task performance without increasing pretraining perplexity.
MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encoding
·2545 words·12 mins
Natural Language Processing Information Retrieval 🏢 Google Research
MUVERA: Revolutionizing multi-vector retrieval with single-vector speed and accuracy!
Multi-turn Reinforcement Learning with Preference Human Feedback
·1515 words·8 mins
Natural Language Processing Dialogue Systems 🏢 Google Research
Multi-turn RLHF surpasses single-turn methods by aligning LLMs with human preferences across entire conversations, not just individual turns. A novel mirror-descent algorithm, MTPO, is introduced, pr…
Linear Transformers are Versatile In-Context Learners
·1783 words·9 mins
Machine Learning Optimization 🏢 Google Research
Linear transformers surprisingly learn intricate optimization algorithms, even surpassing baselines on noisy regression problems, showcasing their unexpected learning capabilities.
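For flavor: prior work has shown that a linear self-attention layer can emulate a step of gradient descent on in-context least-squares regression, and results of this kind underlie the "learned optimization algorithm" framing. A minimal numerical sketch of that update (an illustration under that assumption, not this paper's construction):

```python
import numpy as np

# One gradient-descent step on in-context least squares: starting from
# w = 0, the update reduces to eta * mean_i(y_i * x_i), an attention-like
# weighted sum over the in-context examples.
rng = np.random.default_rng(1)
d, n = 4, 32
w_true = rng.normal(size=d)
X = rng.normal(size=(n, d))
y = X @ w_true + 0.1 * rng.normal(size=n)   # noisy in-context examples

w = np.zeros(d)
eta = 0.01
grad = X.T @ (X @ w - y) / n                # least-squares gradient
w = w - eta * grad                          # one GD step

x_query = rng.normal(size=d)
print("prediction after one step:", x_query @ w)
```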