🏢 Swiss Federal Institute of Technology Lausanne (EPFL)
4M-21: An Any-to-Any Vision Model for Tens of Tasks and Modalities
5846 words · 28 mins
AI Generated
Multimodal Learning
Vision-Language Models
4M-21 achieves any-to-any predictions across 21 diverse vision modalities using a single model, exceeding prior state-of-the-art performance.