DDK: Distilling Domain Knowledge for Efficient Large Language Models
·2140 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Taobao & Tmall Group of Alibaba
DDK dynamically distills domain knowledge to build efficient LLMs.
D-CPT Law: Domain-specific Continual Pre-Training Scaling Law for Large Language Models
·3930 words·19 mins·
Natural Language Processing
Large Language Models
🏢 Taobao & Tmall Group of Alibaba
The D-CPT Law optimizes continual pre-training for LLMs by predicting optimal data-mixture ratios, substantially reducing training costs.