Skip to main content

🏢 Ajou University

UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages
·2221 words·11 mins· loading · loading
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Ajou University
UnifiedCrawl efficiently harvests massive monolingual datasets for low-resource languages from Common Crawl, enabling affordable LLM adaptation via QLoRA, significantly improving performance.