🏢 Institute of Software, Chinese Academy of Sciences

Rethinking Misalignment in Vision-Language Model Adaptation from a Causal Perspective
2085 words · 10 mins
Multimodal Learning · Vision-Language Models
Vision-language model adaptation suffers from misalignment. This paper introduces Causality-Guided Semantic Decoupling and Classification (CDC) to mitigate it and improve adaptation performance.