Gated Slot Attention for Efficient Linear-Time Sequence Modeling
2081 words · 10 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Soochow University
Gated Slot Attention (GSA) enhances linear Transformers for efficient, linear-time sequence modeling. GSA uses a two-layer gated linear attention structure linked via softmax, enabling improved memory capacity.
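
To make the summary concrete, here is a minimal recurrent-form sketch of the idea as described above: a fixed number of memory slots updated by a gated linear-attention rule, with a softmax over slots linking the key-memory read to the value-memory readout. The function name `gsa_recurrent`, the shapes, and the gate parameterization are illustrative assumptions, not the paper's implementation (which would use parallelized training kernels rather than a Python loop).

```python
import torch
import torch.nn.functional as F

def gsa_recurrent(q, k, v, alpha):
    """Recurrent-form sketch of Gated Slot Attention (illustrative).

    q, k: (T, d_k) queries/keys; v: (T, d_v) values;
    alpha: (T, m) per-slot forget gates in (0, 1), m = number of slots.
    Returns outputs of shape (T, d_v).
    """
    T, d_k = q.shape
    d_v = v.shape[-1]
    m = alpha.shape[-1]
    K = q.new_zeros(m, d_k)  # slot-key memory
    V = q.new_zeros(m, d_v)  # slot-value memory
    outs = []
    for t in range(T):
        a = alpha[t].unsqueeze(-1)        # (m, 1) forget gate per slot
        # Gated linear-attention update: decay old slots, write new token
        # ((1 - a) * k[t] broadcasts to an (m, d_k) outer product).
        K = a * K + (1 - a) * k[t]
        V = a * V + (1 - a) * v[t]
        # Softmax link: attend over the m slots with the current query.
        s = F.softmax(K @ q[t], dim=0)    # (m,)
        # Readout: value-memory combination weighted by slot attention.
        outs.append(V.t() @ s)            # (d_v,)
    return torch.stack(outs)

# Hypothetical usage: gates would come from a learned projection in practice.
T, d_k, d_v, m = 16, 64, 64, 8
q, k, v = torch.randn(T, d_k), torch.randn(T, d_k), torch.randn(T, d_v)
alpha = torch.sigmoid(torch.randn(T, m))
out = gsa_recurrent(q, k, v, alpha)       # (T, d_v)
```

Because the state is a fixed set of m slots rather than a cache that grows with sequence length, each step costs O(m·d) regardless of context length, which is what makes the recurrence linear-time.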