🏢 MMLab, the Chinese University of Hong Kong
DiTCtrl: Exploring Attention Control in Multi-Modal Diffusion Transformer for Tuning-Free Multi-Prompt Longer Video Generation
·3843 words·19 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 MMLab, the Chinese University of Hong Kong
DiTCtrl achieves state-of-the-art multi-prompt video generation without retraining by cleverly controlling attention in a diffusion transformer, enabling smooth transitions between video segments.