🏢 University of California, San Diego

Fast Video Generation with Sliding Tile Attention
·4012 words·19 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 University of California, San Diego
Sliding Tile Attention (STA) speeds up video generation by 2.43-3.53x without quality loss by exploiting the inherent data redundancy in video diffusion models.
Personalized Multimodal Large Language Models: A Survey
·599 words·3 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 University of California, San Diego
This survey reveals the exciting advancements in personalized multimodal large language models (MLLMs), offering a novel taxonomy, highlighting key challenges and applications, ultimately pushing the …