🏢 ZIP Lab, Monash University

MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Natural Language Processing, Large Language Models
MiniCache compresses the LLM KV cache along the depth (layer) dimension to reduce its memory footprint.