
🏢 State Key Laboratory for Novel Software Technology, Nanjing University

Bridge the Modality and Capability Gaps in Vision-Language Model Selection
·3390 words·16 mins
AI Generated Natural Language Processing Vision-Language Models 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
SWAB bridges modality and capability gaps in Vision-Language Model selection using optimal transport, enabling accurate prediction of VLM performance without images.
AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
·2546 words·12 mins
Multimodal Learning Vision-Language Models 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
AWT, a novel framework, boosts vision-language models' zero-shot capabilities by augmenting inputs, weighting them dynamically, and leveraging optimal transport to enhance semantic correlations.
AP-Adapter: Improving Generalization of Automatic Prompts on Unseen Text-to-Image Diffusion Models
·2738 words·13 mins
Natural Language Processing Text Generation 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
AP-Adapter improves the generalization of automatic prompts to unseen text-to-image diffusion models through a two-stage prompt optimization method that leverages large language models and inter-model differences.
A Prompt-Based Knowledge Graph Foundation Model for Universal In-Context Reasoning
·2451 words·12 mins
Natural Language Processing Question Answering 🏢 State Key Laboratory for Novel Software Technology, Nanjing University
KG-ICL, a novel prompt-based knowledge graph foundation model, achieves universal in-context reasoning by leveraging in-context learning and a unified tokenizer, outperforming various baselines on 43 …