Large Language Models

Constraint Back-translation Improves Complex Instruction Following of Large Language Models
·3717 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
Constraint Back-translation enhances complex instruction following in LLMs by leveraging inherent constraints in existing datasets for efficient high-quality data creation.
BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments
·6027 words·29 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Fudan University
BitStack: Dynamic LLM sizing for variable memory!
Controlling Language and Diffusion Models by Transporting Activations
·11502 words·54 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Apple
Steering large language and diffusion models is made easy and efficient via Activation Transport (ACT)! This novel framework uses optimal transport theory to precisely control model activations, leadi…
AAAR-1.0: Assessing AI's Potential to Assist Research
·5113 words·25 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Pennsylvania State University
AAAR-1.0 benchmark rigorously evaluates LLMs’ ability to assist in four core research tasks, revealing both potential and limitations.
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
·2943 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Alberta
NeuZip dynamically compresses neural network weights, achieving memory-efficient training and inference without performance loss, significantly reducing the memory footprint of large language models.
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
·4787 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Alibaba Group
M2RC-EVAL: A new massively multilingual benchmark for repository-level code completion, featuring fine-grained annotations and a large instruction dataset, enabling better evaluation of code LLMs acro…