🏢 University of Technology Sydney
VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation
·1959 words·10 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Generation
🏢 University of Technology Sydney
VideoUFO: A new user-focused, million-scale dataset that improves text-to-video generation by aligning training data with real user interests and preferences!
TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation
·2197 words·11 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 University of Technology Sydney
TIP-I2V: A million-scale dataset provides 1.7 million real user text & image prompts for image-to-video generation, boosting model development and safety.