
SDP4Bit: Toward 4-bit Communication Quantization in Sharded Data Parallelism for LLM Training
2596 words · 13 mins
Natural Language Processing Large Language Models 🏢 Indiana University
SDP4Bit achieves up to a 4.08× end-to-end speedup in LLM training by quantizing weight differences and gradients to roughly 4 bits while maintaining training accuracy.
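SDP4Bit's full recipe is more involved than this summary suggests (it combines weight-difference quantization with a two-level gradient scheme). As a rough illustration of the core idea, here is a minimal sketch of generic per-group signed 4-bit quantization applied to a weight-difference tensor; the function names, group size, and NumPy implementation are illustrative assumptions, not the paper's actual code.

```python
import numpy as np

def quantize_4bit(delta, group_size=64):
    # Hypothetical helper: quantize a weight-difference tensor to
    # signed 4-bit integers with one float scale per group.
    flat = delta.ravel().astype(np.float32)
    pad = (-flat.size) % group_size          # pad so groups are full
    flat = np.pad(flat, (0, pad))
    groups = flat.reshape(-1, group_size)
    # One scale per group maps values into the signed 4-bit range [-8, 7].
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    scales[scales == 0] = 1.0                # avoid division by zero
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_4bit(q, scales, shape):
    # Reverse the mapping and trim any padding added during quantization.
    flat = (q.astype(np.float32) * scales).ravel()
    return flat[: int(np.prod(shape))].reshape(shape)

delta = np.random.randn(3, 100).astype(np.float32)
q, scales = quantize_4bit(delta)
recon = dequantize_4bit(q, scales, delta.shape)
# Per-group scaling bounds the reconstruction error by half a
# quantization step: max error <= max|delta| / 14.
max_err = np.abs(recon - delta).max()
```

Communicating `q` (4 bits per value, packable two-per-byte) plus one scale per group in place of full-precision tensors is what drives the bandwidth reduction; the paper's contribution is showing this can be done on weight differences and gradients without hurting convergence.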