🏢 UNC-Chapel Hill

MDocAgent: A Multi-Modal Multi-Agent Framework for Document Understanding
·441 words·3 mins
AI Generated 🤗 Daily Papers Multimodal Learning Multimodal Understanding 🏢 UNC-Chapel Hill
MDocAgent: multi-agent document understanding that integrates text and images for better accuracy.
UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning
·3455 words·17 mins
AI Generated 🤗 Daily Papers Machine Learning Unsupervised Learning 🏢 UNC-Chapel Hill
UPCORE reduces unintended unlearning effects via coreset selection, balancing knowledge removal and utility preservation.
DreamRunner: Fine-Grained Storytelling Video Generation with Retrieval-Augmented Motion Adaptation
·3751 words·18 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 UNC-Chapel Hill
DreamRunner generates high-quality storytelling videos by using LLMs for hierarchical planning, motion retrieval, and a novel spatial-temporal region-based diffusion model for fine-grained control.