🏢 Carleton University

LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate

26 September 2024·4927 words·24 mins

Computer Vision · Image Classification

LookHere uses 2D attention masks to direct the attention heads of Vision Transformers, improving generalization and extrapolation in high-resolution image classification.