🏢 Department of Computer Science, Purdue University

Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention
·3055 words·15 mins
AI Generated · Natural Language Processing · Vision-Language Models
D-LISA combines dynamic modules with language-informed spatial attention for multi-object 3D grounding, surpassing prior state-of-the-art accuracy by 12.8%.