🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences

MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts
2803 words · 14 mins
Multimodal Learning Vision-Language Models 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences
MemVLT is a vision-language tracker that uses memory to generate adaptive prompts, letting it follow changes in target appearance and surpass existing methods.
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning
2454 words · 12 mins
Natural Language Processing Large Language Models 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences
CORY: a sequential cooperative multi-agent RL framework that improves LLM fine-tuning.
Beyond Accuracy: Tracking more like Human via Visual Search
2966 words · 14 mins
Computer Vision Video Understanding 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences
CPDTrack: Human-like Visual Search Boosts Object Tracking!