Posters
2024
What Factors Affect Multi-Modal In-Context Learning? An In-Depth Exploration
·2619 words·13 mins·
Multimodal Learning
Vision-Language Models
🏢 Central South University
Unlocking the full potential of multi-modal in-context learning requires understanding its core factors. This research systematically explores these factors, highlighting the importance of a multi-modal…
What does guidance do? A fine-grained analysis in a simple setting
·3498 words·17 mins·
AI Theory
Optimization
🏢 Duke University
Diffusion guidance, a common generative modeling technique, is shown to not sample from its intended distribution; instead, it heavily biases samples towards the boundary of the conditional distribution…
What do Graph Neural Networks learn? Insights from Tropical Geometry
·1465 words·7 mins·
AI Theory
Representation Learning
🏢 University of Edinburgh
Using tropical geometry, researchers reveal that ReLU-activated message-passing GNNs learn continuous piecewise linear functions, highlighting their expressivity limits and paving the way for enhanced…
WeiPer: OOD Detection using Weight Perturbations of Class Projections
·5838 words·28 mins·
AI Generated
Machine Learning
Deep Learning
🏢 Free University of Berlin
WeiPer enhances OOD detection by cleverly perturbing class projections, creating a richer representation that improves various existing methods and achieves state-of-the-art results.
Weight for Robustness: A Comprehensive Approach towards Optimal Fault-Tolerant Asynchronous ML
·1754 words·9 mins·
Machine Learning
Deep Learning
🏢 Technion
Optimal fault-tolerant asynchronous machine learning is achieved via a novel weighted robust aggregation framework, ensuring efficient training despite Byzantine failures and heterogeneous resources.
Weight Diffusion for Future: Learn to Generalize in Non-Stationary Environments
·2419 words·12 mins·
Machine Learning
Deep Learning
🏢 Tencent AI Lab
Weight Diffusion (W-Diff) masters evolving domain generalization by using conditional diffusion models to learn classifier weight evolution patterns, enabling superior generalization to unseen future domains.
Weight decay induces low-rank attention layers
·1731 words·9 mins·
Machine Learning
Deep Learning
🏢 ETH Zurich
Weight decay in deep learning surprisingly induces low-rank attention layers, potentially harming performance but offering optimization strategies for large language models.
Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach
·2153 words·11 mins·
Computer Vision
Visual Question Answering
🏢 Google DeepMind
LLM-powered data curation boosts web-scale visual entity recognition!
Weak-to-Strong Search: Align Large Language Models via Searching over Small Language Models
·2463 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Shanghai Artificial Intelligence Laboratory
Align LLMs efficiently via test-time search using smaller models!
Weak-eval-Strong: Evaluating and Eliciting Lateral Thinking of LLMs with Situation Puzzles
·1675 words·8 mins·
Natural Language Processing
Large Language Models
🏢 Australian Institute for Machine Learning, University of Adelaide
SPLAT, a new benchmark using situation puzzles, effectively evaluates and elicits lateral thinking in LLMs through a multi-turn player-judge framework, revealing significant performance improvements…
Weak Supervision Performance Evaluation via Partial Identification
·1757 words·9 mins·
Machine Learning
Semi-Supervised Learning
🏢 University of Michigan
This paper introduces a novel method for evaluating weakly supervised models using Fréchet bounds, providing reliable performance bounds without ground truth labels.
WaveAttack: Asymmetric Frequency Obfuscation-based Backdoor Attacks Against Deep Neural Networks
·2153 words·11 mins·
Machine Learning
Deep Learning
🏢 East China Normal University
WaveAttack: a new backdoor attack that uses asymmetric frequency obfuscation to achieve high stealthiness and effectiveness against deep neural networks.
WATT: Weight Average Test Time Adaptation of CLIP
·3263 words·16 mins·
Multimodal Learning
Vision-Language Models
🏢 ETS Montréal, Canada
WATT, a novel test-time adaptation method, boosts CLIP’s performance on domain-shifted images by cleverly averaging weights from multiple text prompts, achieving state-of-the-art results without extra…
WaterMax: breaking the LLM watermark detectability-robustness-quality trade-off
·2720 words·13 mins·
Natural Language Processing
Large Language Models
🏢 Inria, CNRS, IRISA
WaterMax: a novel LLM watermarking scheme achieving high detectability and preserving text quality by cleverly generating multiple texts and selecting the most suitable one.
Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents
·2783 words·14 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 Peking University
LLM-based agents are vulnerable to diverse backdoor attacks that manipulate their reasoning and outputs, highlighting the urgent need for targeted defenses.
Wasserstein Gradient Boosting: A Framework for Distribution-Valued Supervised Learning
·3031 words·15 mins·
AI Generated
Machine Learning
Deep Learning
🏢 University of Edinburgh
Wasserstein Gradient Boosting (WGBoost) extends gradient boosting to handle probability distributions as outputs, enabling more robust and informative predictions in various applications.
Wasserstein Distributionally Robust Optimization through the Lens of Structural Causal Models and Individual Fairness
·2363 words·12 mins·
AI Generated
AI Theory
Fairness
🏢 Max Planck Institute for Intelligent Systems
This paper introduces Causally Fair DRO, a novel framework for robust optimization that addresses individual fairness concerns by incorporating causal structures and sensitive attributes…
Wasserstein Distance Rivals Kullback-Leibler Divergence for Knowledge Distillation
·2875 words·14 mins·
Computer Vision
Image Classification
🏢 Dalian University of Technology
Wasserstein Distance-based Knowledge Distillation (WKD) rivals KL divergence by leveraging rich category interrelations and handling non-overlapping distributions, significantly boosting performance…
Wasserstein convergence of Čech persistence diagrams for samplings of submanifolds
·1477 words·7 mins·
AI Theory
Representation Learning
🏢 Université Paris-Saclay, Inria
This paper proves that Čech persistence diagrams converge to the persistence diagram of the true underlying shape precisely when using Wasserstein distances with p > m, where m is the submanifold dimension, significantly advancing…
Warped Diffusion: Solving Video Inverse Problems with Image Diffusion Models
·2837 words·14 mins·
Computer Vision
Image Generation
🏢 University of Texas at Austin
Warped Diffusion cleverly adapts image diffusion models for video inverse problems, solving flickering and temporal inconsistency issues by viewing video frames as continuous warping transformations…