🏢 Xiamen University
QuoTA: Query-oriented Token Assignment via CoT Query Decouple for Long Video Comprehension
3039 words · 15 mins
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
QuoTA: a training-free, task-aware token assignment method that improves long video comprehension in LVLMs by decoupling the query with Chain-of-Thought reasoning.