🏢 Vrije Universiteit Brussel

Interpreting and Analysing CLIP's Zero-Shot Image Classification via Mutual Knowledge
·3358 words·16 mins
Multimodal Learning · Vision-Language Models
CLIP’s zero-shot image classification decisions are made interpretable using a novel mutual-knowledge approach based on textual concepts, demonstrating effective and human-friendly analysis across div…