🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences

MemVLT: Vision-Language Tracking with Adaptive Memory-based Prompts

26 September 2024·2803 words·14 mins

Multimodal Learning Vision-Language Models 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences

MemVLT: a vision-language tracker that uses memory to generate adaptive prompts, surpassing existing methods by adapting to changing target appearances.

Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning

26 September 2024·2454 words·12 mins

Natural Language Processing Large Language Models 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences

CORY: a novel multi-agent RL framework that boosts LLM fine-tuning!

Beyond Accuracy: Tracking more like Human via Visual Search

26 September 2024·2966 words·14 mins

Computer Vision Video Understanding 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences

CPDTrack: Human-like Visual Search Boosts Object Tracking!