🏢 Shanghai Academy of Artificial Intelligence for Science

Cockatiel: Ensembling Synthetic and Human Preferenced Training for Detailed Video Caption
·3100 words·15 mins
AI Generated · 🤗 Daily Papers · Multimodal Learning · Vision-Language Models
Cockatiel: Ensembling synthetic and human-preferred training improves detailed video captioning, setting a new state of the art on VDCSCORE.