🏢 Dept. of Artificial Intelligence, Korea University

Text-Infused Attention and Foreground-Aware Modeling for Zero-Shot Temporal Action Detection
·2535 words·12 mins
Multimodal Learning · Vision-Language Models
Ti-FAD, a novel zero-shot temporal action detection model, outperforms state-of-the-art methods by enhancing text-related visual focus and foreground awareness.