🏢 Xiamen University

QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension
·3039 words·15 mins
AI Generated · 🤗 Daily Papers · Computer Vision · Video Understanding · 🏢 Xiamen University
QuoTA: Task-aware token assignment boosts long video comprehension in LVLMs via query-decoupled processing, without extra training!