
Oral · Large Language Models

2024

You Only Cache Once: Decoder-Decoder Architectures for Language Models
·2411 words·12 mins
Large Language Models 🏢 Tsinghua University
YOCO: A decoder-decoder architecture for LLMs dramatically reduces memory usage and improves inference speed by caching key-value pairs only once.
Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought
·2755 words·13 mins
Large Language Models 🏢 Chinese University of Hong Kong
Reasoning Boundary Framework (RBF) quantitatively assesses and optimizes chain-of-thought (CoT) in LLMs, offering novel metrics and optimization strategies validated across various models and tasks.
Questioning the Survey Responses of Large Language Models
·2706 words·13 mins
Large Language Models 🏢 Max Planck Institute for Intelligent Systems
LLM survey responses are systematically biased, often masking genuine model capabilities and leading to misleading alignment conclusions.
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression
·3701 words·18 mins
Large Language Models 🏢 Yandex
PV-Tuning achieves new state-of-the-art in extreme LLM compression by going beyond traditional straight-through estimators (STE). This novel framework provides a more accurate and efficient fine-tuning…
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
·3011 words·15 mins
Large Language Models 🏢 Nanjing University
Researchers developed Policy Learning from Tutorial Books (PLfB), a novel method that trains AI agents using knowledge from tutorial books instead of relying solely on real-world data.
Not All Tokens Are What You Need for Pretraining
·2178 words·11 mins
Large Language Models 🏢 Tsinghua University
RHO-1, a novel language model, uses selective pretraining focusing on high-value tokens, achieving state-of-the-art results with significantly less data than existing models.
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
·2608 words·13 mins
Large Language Models 🏢 Hong Kong Polytechnic University
MetaLA: A unified, optimal linear approximation to the softmax attention map, achieving linear complexity and surpassing existing models on various benchmarks.
LLM Evaluators Recognize and Favor Their Own Generations
·3818 words·18 mins
Large Language Models 🏢 MATS
LLMs show self-preference bias in evaluations, favoring their own outputs. This study reveals that LLMs surprisingly recognize their own generations, and this self-recognition directly causes the self-preference…
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
·3569 words·17 mins
Large Language Models 🏢 Meta AI
Large language models surprisingly solve unseen arithmetic tasks; this work reveals how they learn to compose simple skills into complex ones through in-context learning, showing a transition from memorization…
HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
·1914 words·9 mins
Large Language Models 🏢 University of Texas at Austin
HydraLoRA: Asymmetric LoRA boosts LLM fine-tuning efficiency by sharing parameters across tasks while specializing others, outperforming existing methods.
DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
·4529 words·22 mins
Large Language Models 🏢 Tsinghua University
DuQuant: Dual transformations distribute outliers for stronger quantized LLMs.
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
·2382 words·12 mins
Large Language Models 🏢 Harbin Institute of Technology
FUNCODER: a novel code generation framework that uses a divide-and-conquer approach with functional consensus to generate code that meets complex requirements.
Aligner: Efficient Alignment by Learning to Correct
·3091 words·15 mins
Large Language Models 🏢 Peking University
Aligner efficiently aligns LLMs by learning to correct initial responses, achieving significant improvements in helpfulness and harmlessness across various models with resource efficiency.