🏢 ShanghaiTech University

Learning Video Representations without Natural Videos

31 October 2024 · 3154 words · 15 mins

AI Generated · 🤗 Daily Papers · Computer Vision · Video Understanding

High-performing video representation models can be trained using only synthetic videos and images, eliminating the need for large natural video datasets.