🏢 Department of Computer Science, Purdue University
Multi-Object 3D Grounding with Dynamic Modules and Language-Informed Spatial Attention
·3055 words·15 mins·
AI Generated
Natural Language Processing
Vision-Language Models
D-LISA combines dynamic modules with language-informed spatial attention for multi-object 3D grounding, surpassing state-of-the-art accuracy by 12.8%.