🏢 Yonsei University
Train-Attention: Meta-Learning Where to Focus in Continual Knowledge Learning
·330 words·2 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Yonsei University
Train-Attention (TAALM) tackles catastrophic forgetting in LLMs by dynamically weighting tokens during training, boosting learning efficiency and knowledge retention, outperforming existing methods on…
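At its core, Train-Attention replaces the uniform per-token loss with a learned importance weighting. Below is a minimal PyTorch sketch of that weighted-loss idea; `TokenWeigher` and `weighted_lm_loss` are illustrative names rather than the paper's API, and the meta-learning loop that actually trains the weigher in TAALM is omitted.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TokenWeigher(nn.Module):
    """Predicts a non-negative importance weight for each token's hidden state."""
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # (batch, seq_len, hidden) -> (batch, seq_len); softmax over the
        # sequence, rescaled so the mean weight is ~1.
        scores = self.scorer(hidden_states).squeeze(-1)
        return F.softmax(scores, dim=-1) * scores.size(-1)

def weighted_lm_loss(logits: torch.Tensor, labels: torch.Tensor,
                     weights: torch.Tensor) -> torch.Tensor:
    # Standard next-token cross-entropy, scaled per token by the
    # predicted weights instead of averaged uniformly.
    per_token = F.cross_entropy(
        logits.transpose(1, 2),  # (batch, vocab, seq_len)
        labels,                  # (batch, seq_len)
        reduction="none",
    )
    return (per_token * weights).mean()
```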
Graph Convolutions Enrich the Self-Attention in Transformers!
·4545 words·22 mins
Natural Language Processing
Large Language Models
🏢 Yonsei University
Graph Filter-based Self-Attention (GFSA) enhances Transformers by addressing oversmoothing, boosting performance across various tasks with minimal added parameters.
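The GFSA idea is to treat the attention matrix as a graph adjacency and apply a polynomial filter over it, so that an identity term and a higher-order term counteract the low-pass (oversmoothing) behavior of plain A·V. A hedged sketch follows, with hard-coded scalar coefficients standing in for the learned ones in the paper.

```python
import torch

def graph_filter_attention(attn: torch.Tensor, values: torch.Tensor,
                           w0: float = 0.5, w1: float = 1.0,
                           wk: float = -0.5, k: int = 3) -> torch.Tensor:
    """Apply a polynomial graph filter (w0*I + w1*A + wk*A^k) to the values.

    attn:   (batch, heads, seq, seq) row-stochastic attention matrix A
    values: (batch, heads, seq, head_dim)
    """
    av = torch.matmul(attn, values)   # first-order term: A @ V
    akv = av
    for _ in range(k - 1):            # higher-order term: A^k @ V
        akv = torch.matmul(attn, akv)
    # The identity term (w0 * values) preserves high-frequency signal
    # that repeated averaging by A would otherwise smooth away.
    return w0 * values + w1 * av + wk * akv
```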
ANT: Adaptive Noise Schedule for Time Series Diffusion Models
·4333 words·21 mins
Machine Learning
Deep Learning
🏢 Yonsei University
ANT automatically determines optimal noise schedules for time series diffusion models, significantly boosting performance across diverse tasks.
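For intuition, a noise schedule is the sequence of corruption levels a diffusion model applies during training, and "adaptive" here means choosing that sequence per dataset. The sketch below contrasts the standard linear and cosine schedules and selects one with a simple lag-1 autocorrelation heuristic; this heuristic is only a stand-in for ANT's actual data-driven criterion, which the summary does not detail.

```python
import numpy as np

def linear_alphas_bar(T: int, beta_min: float = 1e-4,
                      beta_max: float = 0.02) -> np.ndarray:
    # Cumulative signal level under the classic linear beta schedule.
    betas = np.linspace(beta_min, beta_max, T)
    return np.cumprod(1.0 - betas)

def cosine_alphas_bar(T: int, s: float = 0.008) -> np.ndarray:
    # Cumulative signal level under the cosine schedule
    # (Nichol & Dhariwal, 2021).
    t = np.arange(T + 1) / T
    f = np.cos((t + s) / (1 + s) * np.pi / 2) ** 2
    return f[1:] / f[0]

def pick_schedule(series: np.ndarray, T: int = 1000) -> np.ndarray:
    """Illustrative selector for a 1-D series: strong lag-1 autocorrelation
    favors the cosine schedule (slower early corruption), otherwise linear.
    ANT's real criterion is derived from the data; this is a toy proxy."""
    acf1 = np.corrcoef(series[:-1], series[1:])[0, 1]
    return cosine_alphas_bar(T) if acf1 > 0.9 else linear_alphas_bar(T)
```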