🏢 Taobao & Tmall Group of Alibaba

DDK: Distilling Domain Knowledge for Efficient Large Language Models
2140 words · 11 mins
Natural Language Processing · Large Language Models · 🏢 Taobao & Tmall Group of Alibaba
DDK dynamically adjusts the domain mixture of the distillation data, based on teacher-student performance gaps, to build efficient LLMs; see the sketch below.
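
The DDK paper's code is not reproduced here; as a rough illustration only, the Python sketch below shows one way a domain-gap-driven sampler and a standard KL distillation loss could fit together. The function names (`domain_sampling_weights`, `distill_batch_loss`), the temperatures, and the toy per-domain losses are all assumptions for illustration, not DDK's actual implementation.

```python
# Hypothetical sketch of domain-gap-driven distillation sampling, in the spirit of DDK.
# All names and numbers below are illustrative assumptions, not the paper's API.
import torch
import torch.nn.functional as F

def domain_sampling_weights(teacher_loss: torch.Tensor,
                            student_loss: torch.Tensor,
                            temperature: float = 1.0) -> torch.Tensor:
    """Turn per-domain teacher-student loss gaps into sampling probabilities.

    Domains where the student lags the teacher the most get sampled more often.
    """
    gap = (student_loss - teacher_loss).clamp(min=0.0)   # per-domain gap, never negative
    return F.softmax(gap / temperature, dim=0)           # normalized sampling weights

def distill_batch_loss(student_logits: torch.Tensor,
                       teacher_logits: torch.Tensor,
                       tau: float = 2.0) -> torch.Tensor:
    """Standard temperature-scaled KL-divergence distillation loss on a batch of logits."""
    s = F.log_softmax(student_logits / tau, dim=-1)
    t = F.softmax(teacher_logits / tau, dim=-1)
    return F.kl_div(s, t, reduction="batchmean") * tau * tau

# Toy usage: 4 domains, re-weight how often each domain is drawn for distillation.
teacher_loss = torch.tensor([1.9, 2.1, 1.7, 2.0])   # made-up per-domain validation losses
student_loss = torch.tensor([2.3, 2.2, 2.6, 2.1])
weights = domain_sampling_weights(teacher_loss, student_loss)
domain_idx = torch.multinomial(weights, num_samples=8, replacement=True)
print("sampling weights:", weights.tolist())
print("domains drawn for the next distillation step:", domain_idx.tolist())
```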
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
3930 words · 19 mins
Natural Language Processing · Large Language Models · 🏢 Taobao & Tmall Group of Alibaba
The D-CPT Law optimizes domain-specific continual pre-training for LLMs by predicting the optimal general-to-domain data mixture ratio from a fitted scaling law, drastically cutting training costs; see the sketch below.
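
As a rough illustration of predicting a mixture ratio from a fitted loss curve, the sketch below fits a simple power-law ansatz to a handful of pilot runs and reads off the smallest domain ratio predicted to reach a target loss. The functional form `loss_law`, the pilot-run numbers, and the target threshold are illustrative assumptions, not the D-CPT Law's actual parameterization.

```python
# Hypothetical sketch: fit loss vs. domain mixture ratio, then pick a ratio,
# in the spirit of the D-CPT Law. Numbers and functional form are assumptions.
import numpy as np
from scipy.optimize import curve_fit

def loss_law(r, E, A, alpha):
    """Simple power-law ansatz: domain validation loss as a function of domain ratio r."""
    return E + A * (r + 1e-3) ** (-alpha)

# Pilot continual pre-training runs at a few mixture ratios (made-up numbers).
ratios = np.array([0.05, 0.1, 0.2, 0.4, 0.8])
domain_loss = np.array([2.60, 2.41, 2.28, 2.19, 2.12])

params, _ = curve_fit(loss_law, ratios, domain_loss, p0=[2.0, 0.1, 0.5], maxfev=10000)

# Predict over a dense grid and pick the smallest domain ratio that is expected
# to reach the target loss (fall back to the argmin if no ratio qualifies).
grid = np.linspace(0.01, 1.0, 200)
pred = loss_law(grid, *params)
target = 2.20
feasible = grid[pred <= target]
best_ratio = feasible.min() if feasible.size else grid[np.argmin(pred)]
print(f"fitted params: E={params[0]:.3f} A={params[1]:.3f} alpha={params[2]:.3f}")
print(f"smallest domain ratio predicted to reach loss <= {target}: {best_ratio:.2f}")
```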