🏢 Carleton University

LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
4927 words · 24 mins
Computer Vision · Image Classification
LookHere helps Vision Transformers classify high-resolution images by using 2D attention masks to direct each attention head, improving generalization and extrapolation.