🏢 Institute of Software Chinese Academy of Sciences
Rethinking Misalignment in Vision-Language Model Adaptation from a Causal Perspective
2085 words · 10 mins
Multimodal Learning
Vision-Language Models
Vision-language model adaptation suffers from misalignment; this paper introduces Causality-Guided Semantic Decoupling and Classification (CDC) to mitigate it and improve performance.