🏢 Saudi Data & Artificial Intelligence Authority
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
2774 words · 14 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
Fine-tuning small language models? A higher learning rate to batch size ratio can give reasoning a boost!