🏢 Stanford University
Why are Visually-Grounded Language Models Bad at Image Classification?
·3661 words·18 mins·
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Stanford University
Visually-grounded Language Models (VLMs) surprisingly underperform in image classification. This study reveals that this is primarily due to a lack of sufficient classification data during VLM trainin…
Universal Neural Functionals
·1439 words·7 mins·
Machine Learning
Deep Learning
🏢 Stanford University
Universal Neural Functionals (UNFs) automatically construct permutation-equivariant models for any weight space, improving learned optimizer performance and generalization.
Universal Exact Compression of Differentially Private Mechanisms
·1481 words·7 mins·
AI Theory
Privacy
🏢 Stanford University
Poisson Private Representation (PPR) enables exact compression of any local differential privacy mechanism, achieving order-wise optimal trade-offs between communication, accuracy, and privacy.
Truncated Variance Reduced Value Iteration
·1418 words·7 mins·
Machine Learning
Reinforcement Learning
🏢 Stanford University
Faster algorithms for solving discounted Markov Decision Processes (DMDPs) are introduced, achieving near-optimal sample and time complexities, especially in the sample setting and improving runtimes …
TrAct: Making First-layer Pre-Activations Trainable
·2254 words·11 mins·
Computer Vision
Image Classification
🏢 Stanford University
TrAct boosts vision model training by directly optimizing first-layer activations, leading to significant speedups (1.25x-4x) and improved accuracy.
TFG: Unified Training-Free Guidance for Diffusion Models
·3585 words·17 mins·
Multimodal Learning
Vision-Language Models
🏢 Stanford University
TFG: A unified, training-free framework for boosting diffusion model performance by efficiently searching its algorithm-agnostic design space.
Test-time Adaptation in Non-stationary Environments via Adaptive Representation Alignment
·2451 words·12 mins·
AI Generated
Machine Learning
Representation Learning
🏢 Stanford University
Ada-ReAlign: a novel algorithm for continual test-time adaptation that leverages non-stationary representation learning to effectively align unlabeled data streams with source data, enhancing model ad…
SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention
·3239 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
SwitchHead: A novel MoE attention mechanism accelerates Transformers by significantly reducing computation and memory, matching baseline performance.
Structured flexibility in recurrent neural networks via neuromodulation
·1567 words·8 mins·
AI Theory
Representation Learning
🏢 Stanford University
Neuromodulated RNNs (NM-RNNs) enhance RNN flexibility by dynamically scaling recurrent weights using a neuromodulatory subnetwork, achieving higher accuracy and generalizability on various tasks compa…
Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution
·2842 words·14 mins·
AI Theory
Interpretability
🏢 Stanford University
Stochastic Amortization accelerates feature and data attribution by training amortized models using noisy, yet unbiased, labels, achieving order-of-magnitude speedups over existing methods.
Spectral Adapter: Fine-Tuning in Spectral Space
·3909 words·19 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
Spectral Adapter boosts parameter-efficient fine-tuning by incorporating pretrained weight matrices’ spectral information, enhancing efficiency and multi-adapter fusion.
Smoothie: Label Free Language Model Routing
·3245 words·16 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
SMOOTHIE: Label-free LLM routing achieves up to 10% accuracy gains by using a latent variable model to estimate LLM quality without labeled data.
Self-Supervised Alignment with Mutual Information: Learning to Follow Principles without Preference Labels
·5609 words·27 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
SAMI (Self-Supervised Alignment with Mutual Information) effectively teaches language models to follow principles without human preference labels by maximizing the mutual information between principle…
Self-Refining Diffusion Samplers: Enabling Parallelization via Parareal Iterations
·2449 words·12 mins·
Machine Learning
Deep Learning
🏢 Stanford University
Self-Refining Diffusion Samplers (SRDS) dramatically speeds up diffusion model sampling by leveraging Parareal iterations for parallel-in-time computation, maintaining high-quality outputs.
Segment Any Change
·2244 words·11 mins·
Computer Vision
Image Segmentation
🏢 Stanford University
AnyChange achieves zero-shot image change detection by adapting the Segment Anything Model (SAM) via a training-free bitemporal latent matching method, significantly outperforming previous state-of-th…
Scaling Laws with Vocabulary: Larger Models Deserve Larger Vocabularies
·2496 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Stanford University
Boosting LLM performance: This research shows that larger language models need bigger vocabularies for optimal efficiency and performance.
Scaling Laws for Reward Model Overoptimization in Direct Alignment Algorithms
·2949 words·14 mins·
Natural Language Processing
Large Language Models
🏢 Stanford University
Direct Alignment Algorithms (DAAs) for LLM alignment suffer from over-optimization, even without explicit reward models; this paper empirically demonstrates this and proposes scaling laws to understan…
ReFT: Representation Finetuning for Language Models
·3382 words·16 mins·
Large Language Models
🏢 Stanford University
ReFT: Revolutionizing language model finetuning by directly manipulating hidden representations, achieving superior efficiency and performance compared to existing methods.
RaVL: Discovering and Mitigating Spurious Correlations in Fine-Tuned Vision-Language Models
·2441 words·12 mins·
AI Generated
Multimodal Learning
Vision-Language Models
🏢 Stanford University
RaVL: a novel approach that accurately discovers and effectively mitigates spurious correlations in fine-tuned vision-language models, improving zero-shot classification accuracy.
Quantifying the Gain in Weak-to-Strong Generalization
·2368 words·12 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Stanford University
Weakly supervised strong models outperform weak models; this gain is precisely quantified by the strong model’s misfit error on weak labels.