🏢 Nanjing University

Token-Budget-Aware LLM Reasoning
·3147 words·15 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Nanjing University
TALE: a novel framework that dynamically adjusts token budgets in LLM reasoning prompts, cutting costs by ~70% with minimal accuracy loss.
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors
·2185 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Nanjing University
Create realistic 3D heads with specific hairstyles from text, with no 3D hair data needed!
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption
·4018 words·19 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Nanjing University
InstanceCap improves text-to-video generation through detailed, instance-aware captions.
MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs
·2416 words·12 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Nanjing University
This survey paper offers a comprehensive overview of Multimodal Large Language Model (MLLM) evaluation, systematically categorizing benchmarks and methods, and identifying gaps for future research, th…