
Posters

2024

Understanding Representation of Deep Equilibrium Models from Neural Collapse Perspective
·1624 words·8 mins
Machine Learning Deep Learning 🏒 ShanghaiTech University
Neural Collapse analysis shows that Deep Equilibrium Models, unlike explicit models, excel on imbalanced data thanks to their feature-convergence and self-duality properties.
Understanding Multi-Granularity for Open-Vocabulary Part Segmentation
·2683 words·13 mins
Computer Vision Image Segmentation 🏒 Graduate School of Artificial Intelligence, KAIST
PartCLIPSeg, a novel framework, leverages generalized parts and object-level contexts to achieve significant improvements in open-vocabulary part segmentation, outperforming state-of-the-art methods.
Understanding Model Selection for Learning in Strategic Environments
·394 words·2 mins
Machine Learning Reinforcement Learning 🏒 California Institute of Technology
Larger machine learning models don’t always perform better: this research shows that strategic interactions can reverse the trend, prompting a new paradigm for model selection in games.
Understanding Linear Probing then Fine-tuning Language Models from NTK Perspective
·3071 words·15 mins
Natural Language Processing Large Language Models 🏒 University of Tokyo
Linear probing then fine-tuning (LP-FT) significantly improves language model fine-tuning; this paper uses Neural Tangent Kernel (NTK) theory to explain why.
Understanding Information Storage and Transfer in Multi-Modal Large Language Models
·2906 words·14 mins
Natural Language Processing Large Language Models 🏒 Microsoft Research
Researchers unveil how multi-modal LLMs process information, revealing that early layers are key for storage, and introduce MULTEDIT, a model-editing algorithm for correcting errors and inserting new …
Understanding Hallucinations in Diffusion Models through Mode Interpolation
·2934 words·14 mins
Computer Vision Image Generation 🏒 Carnegie Mellon University
Diffusion models generate unrealistic images by smoothly interpolating between data modes; this paper identifies this ‘mode interpolation’ failure and proposes a metric to detect and reduce it.
Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure
·3934 words·19 mins
Machine Learning Deep Learning 🏒 University of Michigan
Diffusion models’ surprising generalizability stems from an inductive bias towards learning Gaussian data structures, a finding that reshapes our understanding of their training and generalization.
Understanding Emergent Abilities of Language Models from the Loss Perspective
·1924 words·10 mins
Natural Language Processing Large Language Models 🏒 Tsinghua University
Language model emergent abilities aren’t exclusive to large models; they emerge when pre-training loss falls below a threshold, irrespective of model or data size.
Understanding Bias in Large-Scale Visual Datasets
·5190 words·25 mins
AI Generated Computer Vision Image Classification 🏒 University of Pennsylvania
Researchers unveil a novel framework to dissect bias in large-scale visual datasets, identifying unique visual attributes and leveraging language models for detailed analysis, paving the way for creat…
Understanding and Minimising Outlier Features in Transformer Training
·5007 words·24 mins
Natural Language Processing Large Language Models 🏒 ETH Zurich
New methods minimize outlier features in transformer training, improving quantization and efficiency without sacrificing convergence speed.
Understanding and Improving Training-free Loss-based Diffusion Guidance
·2849 words·14 mins
AI Generated Computer Vision Image Generation 🏒 Microsoft Research
Training-free guidance revolutionizes diffusion models by enabling zero-shot conditional generation, but suffers from misaligned gradients and slow convergence. This paper provides theoretical analysi…
Understanding and Improving Adversarial Collaborative Filtering for Robust Recommendation
·2042 words·10 mins
AI Generated AI Theory Robustness 🏒 Chinese Academy of Sciences
PamaCF, a novel personalized adversarial collaborative filtering technique, significantly improves recommendation robustness and accuracy against poisoning attacks by dynamically adjusting perturbatio…
Uncovering the Redundancy in Graph Self-supervised Learning Models
·2804 words·14 mins
AI Generated Machine Learning Self-Supervised Learning 🏒 Beihang University
Graph self-supervised learning models surprisingly exhibit high redundancy, allowing for significant parameter reduction without performance loss. A novel framework, SLIDE, leverages this discovery f…
Uncovering Safety Risks of Large Language Models through Concept Activation Vector
·4605 words·22 mins
AI Generated Natural Language Processing Large Language Models 🏒 Renmin University of China
Researchers developed SCAV, a novel framework to effectively reveal safety risks in LLMs by accurately interpreting their safety mechanisms. SCAV-guided attacks significantly improve attack success r…
Unconditional stability of a recurrent neural circuit implementing divisive normalization
·2728 words·13 mins
AI Generated Machine Learning Deep Learning 🏒 Courant Institute of Mathematical Sciences, NYU
The biologically inspired ORGANICs neural circuit achieves dynamic divisive normalization, ensuring unconditional stability and seamless backpropagation training for high-dimensional recurrent networks.
Unchosen Experts Can Contribute Too: Unleashing MoE Models’ Power by Self-Contrast
·2047 words·10 mins
Natural Language Processing Large Language Models 🏒 Tsinghua University
Self-Contrast Mixture-of-Experts (SCMoE) boosts MoE model reasoning by cleverly using ‘unchosen’ experts during inference. This training-free method contrasts outputs from strong and weak expert acti…
Uncertainty-based Offline Variational Bayesian Reinforcement Learning for Robustness under Diverse Data Corruptions
·2152 words·11 mins
Machine Learning Reinforcement Learning 🏒 University of Science and Technology of China
TRACER, a novel robust offline RL algorithm, uses Bayesian inference to handle uncertainty from diverse data corruptions, significantly outperforming existing methods.
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in LLMs
·3889 words·19 mins
Natural Language Processing Large Language Models 🏒 National University of Singapore
The Uncertainty of Thoughts (UoT) algorithm significantly boosts LLMs’ information-seeking abilities, leading to substantial performance gains across diverse tasks.
UMFC: Unsupervised Multi-Domain Feature Calibration for Vision-Language Models
·2814 words·14 mins
AI Generated Multimodal Learning Vision-Language Models 🏒 Institute of Computing Technology, Chinese Academy of Sciences
UMFC: Unsupervised Multi-domain Feature Calibration improves vision-language model transferability by mitigating inherent model biases via a novel, training-free feature calibration method.
UMB: Understanding Model Behavior for Open-World Object Detection
·3512 words·17 mins
AI Generated Computer Vision Object Detection 🏒 South China University of Technology
UMB, a novel model, enhances open-world object detection by understanding model behavior, surpassing the state of the art with a 5.3 mAP gain on unknown classes.