🏢 Carleton University
LookHere: Vision Transformers with Directed Attention Generalize and Extrapolate
Computer Vision
Image Classification
LookHere uses 2D attention masks to direct the attention heads of Vision Transformers, which improves generalization and extrapolation in high-resolution image classification.