🏢 DAMO Academy, Alibaba Group

SHMT: Self-supervised Hierarchical Makeup Transfer via Latent Diffusion Models
3049 words · 15 mins
AI Generated · Computer Vision · Image Generation · 🏢 DAMO Academy, Alibaba Group
SHMT: Self-supervised Hierarchical Makeup Transfer uses latent diffusion models to realistically and precisely apply diverse makeup styles to faces, even without paired training data, achieving high f…
Animate3D: Animating Any 3D Model with Multi-view Video Diffusion
2384 words · 12 mins
Computer Vision · 3D Vision · 🏢 DAMO Academy, Alibaba Group
Animate3D animates any 3D model using multi-view video diffusion, achieving superior spatiotemporal consistency and straightforward mesh animation.
Aligning Audio-Visual Joint Representations with an Agentic Workflow
1961 words · 10 mins
Multimodal Learning · Audio-Visual Learning · 🏢 DAMO Academy, Alibaba Group
AVAgent uses an LLM-driven workflow to intelligently align audio and visual data, resulting in improved AV joint representations and state-of-the-art performance on various downstream tasks.