🏢 SenseTime Research

SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters
3277 words · 16 mins
AI Generated · 🤗 Daily Papers · Multimodal Learning · Human-AI Interaction
SOLAMI enables immersive, natural interaction with 3D autonomous characters through a unified social vision-language-action model and a novel synthetic multimodal dataset.