
Spotlight Large Language Models

2024

DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
·2333 words·11 mins
Large Language Models 🏢 University of Mannheim
DeTikZify: AI synthesizes publication-ready scientific figures from sketches and existing figures, automatically generating semantics-preserving TikZ code.
Co-occurrence is not Factual Association in Language Models
·1941 words·10 mins
Large Language Models 🏢 Tsinghua University
Language models struggle to learn facts: this study reveals that they prioritize word co-occurrence over true factual associations and proposes new training strategies that improve the generalization of factual knowledge.
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
·2186 words·11 mins
Large Language Models 🏢 Peking University
Buffer of Thoughts (BoT) boosts Large Language Model reasoning by storing and reusing high-level 'thought-templates', achieving significant accuracy and efficiency gains across diverse tasks.
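As a rough illustration of the template-reuse idea (a purely hypothetical structure and template text, not the authors' released code), a thought-template buffer can be as simple as a mapping from task type to a reusable high-level plan that is instantiated with the concrete problem before prompting the model:

```python
# Hypothetical sketch of a thought-template buffer: high-level reasoning
# templates are stored once, retrieved by task type, and instantiated with
# the concrete problem before being sent to an LLM.
thought_buffer = {
    "math_word_problem": (
        "1) Name the unknowns. 2) Translate the text into equations. "
        "3) Solve the equations step by step. 4) Check the answer against the text."
    ),
    "code_debugging": (
        "1) Reproduce the failure. 2) Localize the faulty function. "
        "3) Propose a minimal fix. 4) Re-run the failing case."
    ),
}

def instantiate_template(task_type: str, problem: str) -> str:
    """Build a prompt that reuses the stored high-level plan for this task type."""
    plan = thought_buffer.get(task_type, "Reason step by step.")
    return f"Follow this high-level plan:\n{plan}\n\nProblem: {problem}"

print(instantiate_template("math_word_problem",
                           "A train travels 120 km in 1.5 h; what is its speed?"))
```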
Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation
·2524 words·12 mins
Large Language Models 🏢 Renmin University of China
Householder Reflection Adaptation (HRA) bridges low-rank and orthogonal LLM adaptation by applying a chain of Householder reflections, achieving superior performance with fewer parameters than existing methods.
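The core construction can be illustrated with a short PyTorch sketch (shapes, names, and toy values are illustrative assumptions, not the paper's implementation): r learnable Householder vectors are composed into an orthogonal matrix Q, and the frozen weight is right-multiplied by Q, so the effective update W(Q − I) has rank at most r while Q stays exactly orthogonal.

```python
import torch

def householder_chain(vectors: torch.Tensor) -> torch.Tensor:
    """Compose r Householder reflections H_i = I - 2 u_i u_i^T / ||u_i||^2
    into one orthogonal d x d matrix (illustrative helper)."""
    d = vectors.shape[1]
    Q = torch.eye(d)
    for u in vectors:
        u = u / u.norm()
        Q = Q @ (torch.eye(d) - 2.0 * torch.outer(u, u))
    return Q

d, r = 64, 4                                    # hidden size and number of reflections (toy values)
W = torch.randn(d, d)                           # frozen pretrained weight
hh_vectors = torch.randn(r, d, requires_grad=True)  # the only trainable parameters: r * d values
Q = householder_chain(hh_vectors)
W_adapted = W @ Q                               # orthogonal adaptation of the frozen weight

# Q is orthogonal and W_adapted - W = W @ (Q - I) has rank at most r,
# so the update is low-rank while the transformation stays orthogonal.
print(torch.allclose(Q @ Q.T, torch.eye(d), atol=1e-5))
print(torch.linalg.matrix_rank(W_adapted - W).item() <= r)
```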
An Analysis of Tokenization: Transformers under Markov Data
·2141 words·11 mins
Large Language Models 🏢 University of California, Berkeley
Tokenization plays a crucial role in transformer language models: transformers struggle on simple Markov data without tokenization but achieve near-optimal performance with appropriate tokenization.
A Phase Transition between Positional and Semantic Learning in a Solvable Model of Dot-Product Attention
·2455 words·12 mins
Large Language Models 🏢 EPFL, Lausanne, Switzerland
A solvable model reveals a phase transition in dot-product attention, showing how semantic attention emerges from positional attention with increased data, explaining the qualitative improvements in l…