🏢 University of Miami
Dissecting Query-Key Interaction in Vision Transformers
3134 words · 15 mins
Vision Transformers
This work dissects the self-attention mechanism of vision transformers, revealing how early layers focus on similar features for perceptual grouping while later layers integrate dissimilar features for contextuali…