🏢 Saudi Data & Artificial Intelligence Authority

SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
2774 words · 14 mins
AI Generated · 🤗 Daily Papers · Natural Language Processing · Large Language Models
Fine-tuning small language models? A higher learning rate to batch size ratio can lead to better reasoning!