🏢 Kuaishou Technology
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models
3091 words · 15 mins
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
HAIC improves MLLMs' human action understanding and generation with high-quality video captions and a new benchmark.