🏢 Nanjing University
Token-Budget-Aware LLM Reasoning
·3147 words·15 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Nanjing University
TALE: A novel framework dynamically adjusts token budgets in LLM reasoning prompts, slashing costs by ~70% with minimal accuracy loss.
StrandHead: Text to Strand-Disentangled 3D Head Avatars Using Hair Geometric Priors
·2185 words·11 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Nanjing University
Create realistic 3D heads with specific hairstyles from text, no 3D hair data needed!
InstanceCap: Improving Text-to-Video Generation via Instance-aware Structured Caption
·4018 words·19 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Nanjing University
InstanceCap improves text-to-video generation through detailed, instance-aware captions.
MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs
·2416 words·12 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Nanjing University
This survey paper offers a comprehensive overview of Multimodal Large Language Model (MLLM) evaluation, systematically categorizing benchmarks and methods, and identifying gaps for future research, th…