🏢 ZIP Lab, Monash University
MiniCache: KV Cache Compression in Depth Dimension for Large Language Models
Natural Language Processing
Large Language Models
MiniCache: an approach that compresses the KV cache along the depth dimension to reduce the memory footprint of LLMs.