
Oral · Large Language Models

2024

You Only Cache Once: Decoder-Decoder Architectures for Language Models
·2411 words·12 mins
Large Language Models 🏢 Tsinghua University
YOCO: A decoder-decoder architecture for LLMs dramatically reduces memory usage and improves inference speed by caching key-value pairs only once.
Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought
·2755 words·13 mins
Large Language Models 🏢 Chinese University of Hong Kong
Reasoning Boundary Framework (RBF) quantitatively assesses and optimizes chain-of-thought (CoT) in LLMs, offering novel metrics and optimization strategies validated across various models and tasks.
Questioning the Survey Responses of Large Language Models
·2706 words·13 mins
Large Language Models 🏢 Max Planck Institute for Intelligent Systems
LLM survey responses are systematically biased, often masking genuine model capabilities and leading to misleading alignment conclusions.
PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression
·3701 words·18 mins
Large Language Models 🏢 Yandex
PV-Tuning achieves new state-of-the-art in extreme LLM compression by going beyond traditional straight-through estimators (STE). This novel framework provides a more accurate and efficient fine-tuning…
Policy Learning from Tutorial Books via Understanding, Rehearsing and Introspecting
·3011 words·15 mins
Large Language Models 🏢 Nanjing University
Researchers developed Policy Learning from Tutorial Books (PLfB), a novel method that trains AI agents using knowledge from tutorial books instead of relying solely on real-world data.
Not All Tokens Are What You Need for Pretraining
·2178 words·11 mins
Large Language Models 🏢 Tsinghua University
RHO-1, a novel language model, uses selective pretraining focusing on high-value tokens, achieving state-of-the-art results with significantly less data than existing models.
MetaLA: Unified Optimal Linear Approximation to Softmax Attention Map
·2608 words·13 mins
Large Language Models 🏢 Hong Kong Polytechnic University
MetaLA: A unified, optimal linear approximation to the softmax attention map, achieving linear complexity and surpassing existing models on various benchmarks.
LLM Evaluators Recognize and Favor Their Own Generations
·3818 words·18 mins
Large Language Models 🏢 MATS
LLMs show self-preference bias in evaluations, favoring their own outputs. This study reveals that LLMs surprisingly recognize their own generations, and this self-recognition directly causes the self-preference…
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
·3569 words·17 mins
Large Language Models 🏢 Meta AI
Large language models surprisingly solve unseen arithmetic tasks; this work reveals how they learn to compose simple skills into complex ones through in-context learning, showing a transition from memorization…
HydraLoRA: An Asymmetric LoRA Architecture for Efficient Fine-Tuning
·1914 words·9 mins
Large Language Models 🏢 University of Texas at Austin
HydraLoRA: Asymmetric LoRA boosts LLM fine-tuning efficiency by sharing parameters across tasks while specializing others, outperforming existing methods.
DuQuant: Distributing Outliers via Dual Transformation Makes Stronger Quantized LLMs
·4529 words·22 mins
Large Language Models 🏢 Tsinghua University
DuQuant: Dual transformations distribute outliers for stronger quantized LLMs.
Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
·2382 words·12 mins
Large Language Models 🏢 Harbin Institute of Technology
FUNCODER: a novel code generation framework that uses a divide-and-conquer approach with functional consensus to generate code that meets complex requirements.
Aligner: Efficient Alignment by Learning to Correct
·3091 words·15 mins
Large Language Models 🏢 Peking University
Aligner efficiently aligns LLMs by learning to correct initial responses, achieving significant improvements in helpfulness and harmlessness across various models with resource efficiency.