🏢 Dept. of Artificial Intelligence, Korea University

Text-Infused Attention and Foreground-Aware Modeling for Zero-Shot Temporal Action Detection

26 September 2024·2535 words·12 mins

Multimodal Learning · Vision-Language Models

Ti-FAD, a novel zero-shot temporal action detection model, outperforms state-of-the-art methods by strengthening the model's focus on text-related visual features and its foreground awareness.