🏢 AICV Lab, University of Arkansas

HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
2128 words · 10 mins
AI Generated · Natural Language Processing · Vision-Language Models
HENASY, a novel egocentric video-language model, uses a compositional approach to assemble scene entities for improved interpretability and performance.