Skip to main content

🏢 DAMO Academy, Alibaba Group

Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers
·2242 words·11 mins· loading · loading
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 DAMO Academy, Alibaba Group
Babel: An open multilingual LLM supports over 90% of global speakers, filling the language coverage gap and setting new performance standards.
VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding
·4124 words·20 mins· loading · loading
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 DAMO Academy, Alibaba Group
VideoLLaMA3: Vision-centric training yields state-of-the-art image & video understanding!
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
·3571 words·17 mins· loading · loading
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 DAMO Academy, Alibaba Group
VideoRefer Suite boosts video LLM understanding by introducing a large-scale, high-quality object-level video instruction dataset, a versatile spatial-temporal object encoder model, and a comprehensiv…