DDK: Distilling Domain Knowledge for Efficient Large Language Models
·2140 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Taobao & Tmall Group of Alibaba
DDK dynamically distills domain knowledge to build efficient LLMs.
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
·3930 words·19 mins·
Natural Language Processing
Large Language Models
🏢 Taobao & Tmall Group of Alibaba
The D-CPT Law optimizes continual pre-training for LLMs by predicting optimal data-mixture ratios, substantially reducing training costs.