
The Power of Hard Attention Transformers on Data Sequences: A formal language theoretic perspective
·284 words·2 mins·
AI Generated AI Theory Generalization 🏢 RPTU Kaiserslautern-Landau
Hard attention transformers show surprisingly greater power when processing numerical data sequences than when processing strings; this gap is analyzed theoretically via circuit complexity.