🏢 National Yang Ming Chiao Tung University

NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
2217 words · 11 mins
Computer Vision Video Understanding 🏢 National Yang Ming Chiao Tung University
NaRCan: High-quality video editing via diffusion priors and hybrid deformation fields.
A Cat Is A Cat (Not A Dog!): Unraveling Information Mix-ups in Text-to-Image Encoders through Causal Analysis and Embedding Optimization
3344 words · 16 mins
Multimodal Learning Vision-Language Models 🏢 National Yang Ming Chiao Tung University
Researchers unveil how causal text encoding in text-to-image models leads to information loss and bias, proposing a novel training-free optimization method that significantly improves information bala…