🏢 KAIST
MIVE: New Design and Benchmark for Multi-Instance Video Editing
·7714 words·37 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 KAIST
MIVE edits multiple objects in a video at once, applying each edit accurately without affecting other regions of the frame.
Controllable Human Image Generation with Personalized Multi-Garments
·4062 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 KAIST
BootComp generates realistic human images wearing multiple garments, using a novel synthetic data pipeline and a diffusion model, and enables applications such as virtual try-on.
Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction
·2991 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 KAIST
CoordTok is a novel video tokenizer that drastically reduces the token count for long videos, enabling memory-efficient training of diffusion models for high-quality, long video generation.