Large Language Models
LLaNA: Large Language and NeRF Assistant
·4250 words·20 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Bologna
LLaNA: A novel Multimodal Large Language Model directly processes NeRF weights to enable NeRF captioning and Q&A, outperforming traditional 2D/3D-based methods.
LLaMo: Large Language Model-based Molecular Graph Assistant
·2751 words·13 mins
Natural Language Processing
Large Language Models
🏢 Korea University
LLaMo, a novel Large Language Model-based Molecular Graph Assistant, uses multi-level graph projection and instruction tuning to achieve superior performance on diverse molecular tasks.
Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack
·2933 words·14 mins
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Lisa: a novel lazy safety alignment method safeguards LLMs against harmful fine-tuning attacks by introducing a proximal term to constrain model drift, significantly improving alignment performance.
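A minimal sketch of the proximal idea in a standard PyTorch fine-tuning loop. The coefficient `rho` and the `aligned_state` snapshot are illustrative assumptions, not Lisa's exact formulation:

```python
import torch

def proximal_loss(model, aligned_state, task_loss, rho=0.1):
    # Penalize squared L2 drift of the trainable parameters from the
    # safety-aligned snapshot, on top of the ordinary fine-tuning loss.
    drift = sum(
        torch.sum((p - aligned_state[name].to(p.device)) ** 2)
        for name, p in model.named_parameters()
        if p.requires_grad
    )
    return task_loss + 0.5 * rho * drift
```

The snapshot would be captured once before fine-tuning, e.g. `aligned_state = {n: p.detach().clone() for n, p in model.named_parameters()}`.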
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
·3222 words·16 mins
Natural Language Processing
Large Language Models
🏢 Hong Kong University of Science and Technology
LISA, a layerwise importance sampling method, dramatically improves memory-efficient large language model fine-tuning, outperforming existing methods while using less GPU memory.
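A minimal sketch of the layerwise-sampling idea, assuming a PyTorch model whose blocks live in an `nn.ModuleList`; LISA's actual schedule (how often layers are resampled, which modules stay always-trainable) differs:

```python
import random
import torch.nn as nn

def resample_active_layers(layers: nn.ModuleList, k: int = 2) -> None:
    # Freeze every layer, then unfreeze a random subset of k layers.
    # Gradients and optimizer state are only allocated for the active
    # layers, which is where the GPU-memory savings come from.
    for layer in layers:
        for p in layer.parameters():
            p.requires_grad = False
    for layer in random.sample(list(layers), k):
        for p in layer.parameters():
            p.requires_grad = True
```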
Linking In-context Learning in Transformers to Human Episodic Memory
·3883 words·19 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 UC San Diego
Transformers’ in-context learning mirrors human episodic memory, with specific attention heads acting like the brain’s contextual maintenance and retrieval system.
Linguistic Collapse: Neural Collapse in (Large) Language Models
·6528 words·31 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Toronto
Scaling causal language models reveals a connection between neural collapse properties, model size, and improved generalization, highlighting NC’s broader relevance to LLMs.
Limits of Transformer Language Models on Learning to Compose Algorithms
·2755 words·13 mins
Natural Language Processing
Large Language Models
🏢 IBM Research
Large Language Models struggle with compositional tasks, needing far more data to learn a composed algorithm than to learn its sub-tasks individually. This paper reveals surprising sample …
Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models
·1918 words·10 mins
Natural Language Processing
Large Language Models
🏢 University of British Columbia
This paper presents a fully automated method for PDDL translation and planning using LLMs and environment interaction, achieving a 66% success rate on challenging PDDL domains.
LeDex: Training LLMs to Better Self-Debug and Explain Code
·3820 words·18 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Purdue University
LEDEX: A novel training framework significantly boosts LLMs’ code self-debugging by using automated data collection, supervised fine-tuning, and reinforcement learning, leading to more accurate code a…
Learning to Reason via Program Generation, Emulation, and Search
·1757 words·9 mins
Natural Language Processing
Large Language Models
🏢 Johns Hopkins University
Language models excel at generating programs for algorithmic tasks, but struggle with soft reasoning. COGEX leverages pseudo-programs and program emulation to tackle these tasks, while COTACS searches…
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
·3569 words·17 mins
Large Language Models
🏢 Meta AI
Large language models surprisingly solve unseen arithmetic tasks; this work reveals how they learn to compose simple skills into complex ones through in-context learning, showing a transition from mem…
Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf
·2358 words·12 mins
Natural Language Processing
Large Language Models
🏢 Institute of Automation, Chinese Academy of Sciences
RL-instructed language models excel at strategic communication in One Night Ultimate Werewolf, demonstrating the importance of discussion tactics in complex games.
Learning Goal-Conditioned Representations for Language Reward Models
·3372 words·16 mins
Natural Language Processing
Large Language Models
🏢 Scale AI
Goal-conditioned contrastive learning boosts language reward model performance and enables better control of language model generation.
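A hypothetical sketch of what a goal-conditioned contrastive objective could look like, using a standard InfoNCE loss over batch pairs; the paper's exact representations and sampling scheme may differ:

```python
import torch
import torch.nn.functional as F

def goal_contrastive_loss(state_emb, goal_emb, temperature=0.1):
    # InfoNCE: each state representation should be most similar to its
    # own goal representation and dissimilar to other goals in the batch.
    state_emb = F.normalize(state_emb, dim=-1)
    goal_emb = F.normalize(goal_emb, dim=-1)
    logits = state_emb @ goal_emb.t() / temperature  # (B, B) similarities
    targets = torch.arange(state_emb.size(0), device=state_emb.device)
    return F.cross_entropy(logits, targets)
```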
Learn To be Efficient: Build Structured Sparsity in Large Language Models
·2525 words·12 mins
Large Language Models
🏢 University of Michigan
Learn-To-be-Efficient (LTE) trains LLMs to achieve structured sparsity, boosting inference speed by 25% at 50% sparsity without sacrificing accuracy.
Learn more, but bother less: parameter efficient continual learning
·2442 words·12 mins
Natural Language Processing
Large Language Models
🏢 Pennsylvania State University
LB-CL: A novel parameter-efficient continual learning method for LLMs that boosts performance and reduces forgetting by leveraging parametric knowledge transfer and maintaining orthogonal low-rank sub…
Latent Paraphrasing: Perturbation on Layers Improves Knowledge Injection in Language Models
·2300 words·11 mins
Natural Language Processing
Large Language Models
🏢 KRAFTON
LaPael improves LLM knowledge injection by applying learned noise to early layers, enabling diverse and efficient knowledge updates without repeated external model usage.
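A hypothetical sketch of applying learned, input-dependent noise to an early-layer hidden state; the module name and the form of the perturbation here are assumptions, not LaPael's exact design:

```python
import torch
import torch.nn as nn

class LatentPerturber(nn.Module):
    # Adds input-dependent Gaussian noise to a hidden state so a single
    # training example yields many paraphrase-like latent variants.
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scale = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        eps = torch.randn_like(hidden)
        return hidden + self.scale(hidden) * eps
```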
Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning
·2160 words·11 mins
Natural Language Processing
Large Language Models
🏢 Beijing University of Technology
LLM-DA dynamically adapts LLM-generated rules for interpretable temporal knowledge graph reasoning, significantly improving accuracy without fine-tuning.
Large Language Models Must Be Taught to Know What They Don’t Know
·3020 words·15 mins
Natural Language Processing
Large Language Models
🏢 New York University
Teach LLMs uncertainty for reliable high-stakes predictions: fine-tuning with graded examples significantly improves LLMs’ uncertainty calibration and generalizes well.
Large language model validity via enhanced conformal prediction methods
·2089 words·10 mins
Natural Language Processing
Large Language Models
🏢 Stanford University
New conformal inference methods enhance LLM validity by providing adaptive validity guarantees and improving the quality of LLM outputs, addressing prior methods’ limitations.
Large Language Model Unlearning via Embedding-Corrupted Prompts
·7618 words·36 mins
Natural Language Processing
Large Language Models
🏢 UC Santa Cruz
ECO prompts enable efficient LLM unlearning by corrupting prompts flagged for forgetting, achieving promising results across various LLMs and tasks with minimal side effects.
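A minimal sketch of the corruption step, assuming a separate classifier has already flagged the prompt; ECO itself optimizes the corruption rather than using plain Gaussian noise, so `sigma` here is an illustrative stand-in:

```python
import torch

def corrupt_if_flagged(token_embeddings: torch.Tensor,
                       flagged: bool, sigma: float = 1.0) -> torch.Tensor:
    # Leave ordinary prompts untouched; perturb the embeddings of prompts
    # flagged as targeting forgotten content so the frozen LLM behaves
    # as if it never learned the answer.
    if not flagged:
        return token_embeddings
    return token_embeddings + sigma * torch.randn_like(token_embeddings)
```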