🏢 UNC-Chapel Hill
MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
·441 words·3 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Understanding
🏢 UNC-Chapel Hill
MDocAgent is a multi-modal, multi-agent framework that integrates text and image information to improve document-understanding accuracy.
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
·3455 words·17 mins·
AI Generated
🤗 Daily Papers
Machine Learning
Unsupervised Learning
🏢 UNC-Chapel Hill
UPCORE reduces unintended unlearning effects via coreset selection, balancing knowledge removal and utility preservation.
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation
·3751 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 UNC-Chapel Hill
DreamRunner generates high-quality storytelling videos by using LLMs for hierarchical planning, motion retrieval, and a novel spatial-temporal region-based diffusion model for fine-grained control.