🏢 Samsung AI Cambridge
Discriminative Fine-tuning of LVLMs
·4145 words·20 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Samsung AI Cambridge
VladVA: A novel training framework converts generative LVLMs into powerful discriminative models, achieving state-of-the-art performance on image-text retrieval and compositionality benchmarks.