🏢 University of Edinburgh
Multimodal Music Generation with Explicit Bridges and Retrieval Augmentation
3107 words · 15 mins
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Generation
VMB generates music from videos, images, and text, using description and retrieval bridges to improve quality and controllability.