🏢 Dept. of Artificial Intelligence, Korea University
Text-Infused Attention and Foreground-Aware Modeling for Zero-Shot Temporal Action Detection
Multimodal Learning
Vision-Language Models
Ti-FAD is a zero-shot temporal action detection model that outperforms state-of-the-art methods by strengthening visual focus on text-related content and improving foreground awareness.