🏢 Yonsei University
Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning
·330 words·2 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Yonsei University
Train-Attention (TAALM) tackles catastrophic forgetting in LLMs by dynamically weighting tokens during training, boosting learning efficiency and knowledge retention, outperforming existing methods on…
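At its core, Train-Attention replaces the uniform per-token loss with a learned importance weighting. Below is a minimal PyTorch sketch of that weighted-loss idea; `TokenWeigher` and `weighted_lm_loss` are illustrative names rather than the paper's API, and the meta-learning loop that actually trains the weigher in TAALM is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenWeigher(nn.Module):
    """Predicts a non-negative importance weight for each token's hidden state."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len, hidden) -> (batch, seq_len); softmax over the
        # sequence, rescaled so the mean weight is ~1.
        scores = self.scorer(hidden_states).squeeze(-1)
        return F.softmax(scores, dim=-1) * scores.size(-1)

def weighted_lm_loss(logits: torch.Tensor, labels: torch.Tensor,
                     weights: torch.Tensor) -> torch.Tensor:
    # Standard next-token cross-entropy, scaled per token by the
    # predicted weights instead of averaged uniformly.
    per_token = F.cross_entropy(
        logits.transpose(1, 2),  # (batch, vocab, seq_len)
        labels,                  # (batch, seq_len)
        reduction="none",
    )
    return (per_token * weights).mean()
```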
Graph Convolutions Enrich the Self-Attention in Transformers!
·4545 words·22 mins
Natural Language Processing
Large Language Models
🏢 Yonsei University
Graph Filter-based Self-Attention (GFSA) enhances Transformers by addressing oversmoothing, boosting performance across various tasks with minimal added parameters.
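The GFSA idea is to treat the attention matrix as a graph adjacency and apply a polynomial filter over it, so that an identity term and a higher-order term counteract the low-pass (oversmoothing) behavior of plain A·V. A hedged sketch follows, with hard-coded scalar coefficients standing in for the learned ones in the paper.

```python
import torch

def graph_filter_attention(attn: torch.Tensor, values: torch.Tensor,
                           w0: float = 0.5, w1: float = 1.0,
                           wk: float = -0.5, k: int = 3) -> torch.Tensor:
    """Apply a polynomial graph filter (w0*I + w1*A + wk*A^k) to the values.

    attn:   (batch, heads, seq, seq) row-stochastic attention matrix A
    values: (batch, heads, seq, head_dim)
    """
    av = torch.matmul(attn, values)   # first-order term: A @ V
    akv = av
    for _ in range(k - 1):            # higher-order term: A^k @ V
        akv = torch.matmul(attn, akv)
    # The identity term (w0 * values) preserves high-frequency signal
    # that repeated averaging by A would otherwise smooth away.
    return w0 * values + w1 * av + wk * akv
```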
ANT: Adaptive Noise Schedule for Time Series Diffusion Models
·4333 words·21 mins
Machine Learning
Deep Learning
🏢 Yonsei University
ANT automatically determines optimal noise schedules for time series diffusion models, significantly boosting performance across diverse tasks.
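For intuition, a noise schedule is the sequence of corruption levels a diffusion model applies during training, and "adaptive" here means choosing that sequence per dataset. The sketch below contrasts the standard linear and cosine schedules and selects one with a simple lag-1 autocorrelation heuristic; this heuristic is only a stand-in for ANT's actual data-driven criterion, which the summary does not detail.

```python
import numpy as np

def linear_alphas_bar(T: int, beta_min: float = 1e-4,
                      beta_max: float = 0.02) -> np.ndarray:
    # Cumulative signal level under the classic linear beta schedule.
    betas = np.linspace(beta_min, beta_max, T)
    return np.cumprod(1.0 - betas)

def cosine_alphas_bar(T: int, s: float = 0.008) -> np.ndarray:
    # Cumulative signal level under the cosine schedule
    # (Nichol & Dhariwal, 2021).
    t = np.arange(T + 1) / T
    f = np.cos((t + s) / (1 + s) * np.pi / 2) ** 2
    return f[1:] / f[0]

def pick_schedule(series: np.ndarray, T: int = 1000) -> np.ndarray:
    """Illustrative selector for a 1-D series: strong lag-1 autocorrelation
    favors the cosine schedule (slower early corruption), otherwise linear.
    ANT's real criterion is derived from the data; this is a toy proxy."""
    acf1 = np.corrcoef(series[:-1], series[1:])[0, 1]
    return cosine_alphas_bar(T) if acf1 > 0.9 else linear_alphas_bar(T)
```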