🏢 New York University
Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
4503 words · 22 mins
Multimodal Learning
Vision-Language Models
🏢 New York University
Cambrian-1: Open, vision-centric multimodal LLMs achieve state-of-the-art performance using a novel spatial vision aggregator and high-quality data.
BAKU: An Efficient Transformer for Multi-Task Policy Learning
4209 words · 20 mins
AI Applications
Robotics
🏢 New York University
BAKU: A simple transformer enables efficient multi-task robot policy learning, achieving 91% success on real-world tasks with limited data.