🏢 AICV Lab, University of Arkansas
HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
2128 words · 10 mins
AI Generated
Natural Language Processing
Vision-Language Models
HENASY, a novel egocentric video-language model, uses a compositional approach to assemble scene entities for improved interpretability and performance.