🏢 New York University

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs
·4503 words·22 mins
Multimodal Learning · Vision-Language Models · 🏢 New York University
Cambrian-1: Open, vision-centric multimodal LLMs achieve state-of-the-art performance using a novel spatial vision aggregator and high-quality data.
BAKU: An Efficient Transformer for Multi-Task Policy Learning
·4209 words·20 mins
AI Applications · Robotics · 🏢 New York University
BAKU: A simple transformer enables efficient multi-task robot policy learning, achieving 91% success on real-world tasks with limited data.