🏢 State Key Laboratory for Novel Software Technology, Nanjing University
Bridge the Modality and Capability Gaps in Vision-Language Model Selection
3390 words · 16 mins
AI Generated
Natural Language Processing
Vision-Language Models
🏢 State Key Laboratory for Novel Software Technology, Nanjing University
SWAB bridges modality and capability gaps in Vision-Language Model selection using optimal transport, enabling accurate prediction of VLM performance without images.
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
2546 words · 12 mins
Multimodal Learning
Vision-Language Models
🏢 State Key Laboratory for Novel Software Technology, Nanjing University
AWT, a novel framework, boosts the zero-shot capabilities of vision-language models by augmenting inputs, weighting them dynamically, and leveraging optimal transport to enhance semantic correlations.
AP-Adapter: Improving Generalization of Automatic Prompts on Unseen Text-to-Image Diffusion Models
2738 words · 13 mins
Natural Language Processing
Text Generation
🏢 State Key Laboratory for Novel Software Technology, Nanjing University
AP-Adapter improves how well automatic prompts generalize to unseen text-to-image diffusion models, using a two-stage prompt optimization method that leverages large language models and inter-model differences.
A Prompt-Based Knowledge Graph Foundation Model for Universal In-Context Reasoning
2451 words · 12 mins
Natural Language Processing
Question Answering
🏢 State Key Laboratory for Novel Software Technology, Nanjing University
KG-ICL, a novel prompt-based knowledge graph foundation model, achieves universal in-context reasoning by leveraging in-context learning and a unified tokenizer, outperforming various baselines on 43 …