
Spotlight Large Language Models

2024

DeTikZify: Synthesizing Graphics Programs for Scientific Figures and Sketches with TikZ
·2333 words·11 mins
Large Language Models 🏢 University of Mannheim
DeTikZify: AI synthesizes publication-ready scientific figures from sketches and existing figures, automatically generating semantics-preserving TikZ code.
Co-occurrence is not Factual Association in Language Models
·1941 words·10 mins
Large Language Models 🏢 Tsinghua University
Language models struggle to learn facts: this study reveals that they prioritize word co-occurrence over true factual associations and proposes new training strategies that improve the generalization of factual knowledge.
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
·2186 words·11 mins
Large Language Models 🏢 Peking University
Buffer of Thoughts (BoT) boosts Large Language Model reasoning by storing and reusing high-level 'thought-templates', achieving significant accuracy and efficiency gains across diverse tasks.
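As a rough illustration of the template-reuse idea (a purely hypothetical structure and template text, not the authors' released code), a thought-template buffer can be as simple as a mapping from task type to a reusable high-level plan that is instantiated with the concrete problem before prompting the model:

```python
# Hypothetical sketch of a thought-template buffer: high-level reasoning
# templates are stored once, retrieved by task type, and instantiated with
# the concrete problem before being sent to an LLM.
thought_buffer = {
    "math_word_problem": (
        "1) Name the unknowns. 2) Translate the text into equations. "
        "3) Solve the equations step by step. 4) Check the answer against the text."
    ),
    "code_debugging": (
        "1) Reproduce the failure. 2) Localize the faulty function. "
        "3) Propose a minimal fix. 4) Re-run the failing case."
    ),
}

def instantiate_template(task_type: str, problem: str) -> str:
    """Build a prompt that reuses the stored high-level plan for this task type."""
    plan = thought_buffer.get(task_type, "Reason step by step.")
    return f"Follow this high-level plan:\n{plan}\n\nProblem: {problem}"

print(instantiate_template("math_word_problem",
                           "A train travels 120 km in 1.5 h; what is its speed?"))
```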
Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation
·2524 words·12 mins
Large Language Models 🏢 Renmin University of China
Householder Reflection Adaptation (HRA) bridges low-rank and orthogonal LLM adaptation by applying a chain of Householder reflections, achieving superior performance with fewer parameters than existing methods.
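The core construction can be illustrated with a short PyTorch sketch (shapes, names, and toy values are illustrative assumptions, not the paper's implementation): r learnable Householder vectors are composed into an orthogonal matrix Q, and the frozen weight is right-multiplied by Q, so the effective update W(Q − I) has rank at most r while Q stays exactly orthogonal.

```python
import torch

def householder_chain(vectors: torch.Tensor) -> torch.Tensor:
    """Compose r Householder reflections H_i = I - 2 u_i u_i^T / ||u_i||^2
    into one orthogonal d x d matrix (illustrative helper)."""
    d = vectors.shape[1]
    Q = torch.eye(d)
    for u in vectors:
        u = u / u.norm()
        Q = Q @ (torch.eye(d) - 2.0 * torch.outer(u, u))
    return Q

d, r = 64, 4                                    # hidden size and number of reflections (toy values)
W = torch.randn(d, d)                           # frozen pretrained weight
hh_vectors = torch.randn(r, d, requires_grad=True)  # the only trainable parameters: r * d values
Q = householder_chain(hh_vectors)
W_adapted = W @ Q                               # orthogonal adaptation of the frozen weight

# Q is orthogonal and W_adapted - W = W @ (Q - I) has rank at most r,
# so the update is low-rank while the transformation stays orthogonal.
print(torch.allclose(Q @ Q.T, torch.eye(d), atol=1e-5))
print(torch.linalg.matrix_rank(W_adapted - W).item() <= r)
```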
An Analysis of Tokenization: Transformers under Markov Data
·2141 words·11 mins
Large Language Models 🏢 University of California, Berkeley
Tokenization plays a crucial role in transformer language models: transformers struggle on simple Markov data without tokenization but achieve near-optimal performance with appropriate tokenization.
A Phase Transition between Positional and Semantic Learning in a Solvable Model of Dot-Product Attention
·2455 words·12 mins
Large Language Models 🏢 EPFL, Lausanne, Switzerland
A solvable model reveals a phase transition in dot-product attention, showing how semantic attention emerges from positional attention with increased data, explaining the qualitative improvements in l…