Large Language Models
LLaNA: Large Language and NeRF Assistant
·4250 words·20 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Bologna
LLaNA: A novel Multimodal Large Language Model directly processes NeRF weights to enable NeRF captioning and Q&A, outperforming traditional 2D/3D-based methods.
LLaMo: Large Language Model-based Molecular Graph Assistant
·2751 words·13 mins
Natural Language Processing
Large Language Models
🏢 Korea University
LLaMo, a novel Large Language Model-based Molecular Graph Assistant, uses multi-level graph projection and instruction tuning to achieve superior performance on diverse molecular tasks.
Lisa: Lazy Safety Alignment for Large Language Models against Harmful Fine-tuning Attack
·2933 words·14 mins
Natural Language Processing
Large Language Models
🏢 Georgia Institute of Technology
Lisa: a novel lazy safety alignment method safeguards LLMs against harmful fine-tuning attacks by introducing a proximal term to constrain model drift, significantly improving alignment performance.
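A minimal sketch of the proximal idea in a standard PyTorch fine-tuning loop. The coefficient `rho` and the `aligned_state` snapshot are illustrative assumptions, not Lisa's exact formulation:

```python
import torch

def proximal_loss(model, aligned_state, task_loss, rho=0.1):
    # Penalize squared L2 drift of the trainable parameters from the
    # safety-aligned snapshot, on top of the ordinary fine-tuning loss.
    drift = sum(
        torch.sum((p - aligned_state[name].to(p.device)) ** 2)
        for name, p in model.named_parameters()
        if p.requires_grad
    )
    return task_loss + 0.5 * rho * drift
```

The snapshot would be captured once before fine-tuning, e.g. `aligned_state = {n: p.detach().clone() for n, p in model.named_parameters()}`.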
LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning
·3222 words·16 mins
Natural Language Processing
Large Language Models
🏢 Hong Kong University of Science and Technology
LISA, a layerwise importance sampling method, dramatically improves memory-efficient large language model fine-tuning, outperforming existing methods while using less GPU memory.
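A minimal sketch of the layerwise-sampling idea, assuming a PyTorch model whose blocks live in an `nn.ModuleList`; LISA's actual schedule (how often layers are resampled, which modules stay always-trainable) differs:

```python
import random
import torch.nn as nn

def resample_active_layers(layers: nn.ModuleList, k: int = 2) -> None:
    # Freeze every layer, then unfreeze a random subset of k layers.
    # Gradients and optimizer state are only allocated for the active
    # layers, which is where the GPU-memory savings come from.
    for layer in layers:
        for p in layer.parameters():
            p.requires_grad = False
    for layer in random.sample(list(layers), k):
        for p in layer.parameters():
            p.requires_grad = True
```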
Linking In-context Learning in Transformers to Human Episodic Memory
·3883 words·19 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 UC San Diego
Transformers’ in-context learning mirrors human episodic memory, with specific attention heads acting like the brain’s contextual maintenance and retrieval system.
Linguistic Collapse: Neural Collapse in (Large) Language Models
·6528 words·31 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 University of Toronto
Scaling causal language models reveals a connection between neural collapse properties, model size, and improved generalization, highlighting NC’s broader relevance to LLMs.
Limits of Transformer Language Models on Learning to Compose Algorithms
·2755 words·13 mins
Natural Language Processing
Large Language Models
🏢 IBM Research
Large Language Models struggle with compositional tasks, needing far more data to learn a composed algorithm than to learn its sub-tasks individually. This paper reveals surprising sample …
Leveraging Environment Interaction for Automated PDDL Translation and Planning with Large Language Models
·1918 words·10 mins
Natural Language Processing
Large Language Models
🏢 University of British Columbia
This paper presents a fully automated method for PDDL translation and planning using LLMs and environment interaction, achieving a 66% success rate on challenging PDDL domains.
LeDex: Training LLMs to Better Self-Debug and Explain Code
·3820 words·18 mins
AI Generated
Natural Language Processing
Large Language Models
🏢 Purdue University
LEDEX: A novel training framework significantly boosts LLMs’ code self-debugging by using automated data collection, supervised fine-tuning, and reinforcement learning, leading to more accurate code a…
Learning to Reason via Program Generation, Emulation, and Search
·1757 words·9 mins
Natural Language Processing
Large Language Models
🏢 Johns Hopkins University
Language models excel at generating programs for algorithmic tasks, but struggle with soft reasoning. COGEX leverages pseudo-programs and program emulation to tackle these tasks, while COTACS searches…
Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks
·3569 words·17 mins
Large Language Models
🏢 Meta AI
Large language models surprisingly solve unseen arithmetic tasks; this work reveals how they learn to compose simple skills into complex ones through in-context learning, showing a transition from mem…
Learning to Discuss Strategically: A Case Study on One Night Ultimate Werewolf
·2358 words·12 mins
Natural Language Processing
Large Language Models
🏢 Institute of Automation, Chinese Academy of Sciences
RL-instructed language models excel at strategic communication in One Night Ultimate Werewolf, demonstrating the importance of discussion tactics in complex games.
Learning Goal-Conditioned Representations for Language Reward Models
·3372 words·16 mins
Natural Language Processing
Large Language Models
🏢 Scale AI
Goal-conditioned contrastive learning boosts language reward model performance and enables better control of language model generation.
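A hypothetical sketch of what a goal-conditioned contrastive objective could look like, using a standard InfoNCE loss over batch pairs; the paper's exact representations and sampling scheme may differ:

```python
import torch
import torch.nn.functional as F

def goal_contrastive_loss(state_emb, goal_emb, temperature=0.1):
    # InfoNCE: each state representation should be most similar to its
    # own goal representation and dissimilar to other goals in the batch.
    state_emb = F.normalize(state_emb, dim=-1)
    goal_emb = F.normalize(goal_emb, dim=-1)
    logits = state_emb @ goal_emb.t() / temperature  # (B, B) similarities
    targets = torch.arange(state_emb.size(0), device=state_emb.device)
    return F.cross_entropy(logits, targets)
```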
Learn To be Efficient: Build Structured Sparsity in Large Language Models
·2525 words·12 mins
Large Language Models
🏢 University of Michigan
Learn-To-be-Efficient (LTE) trains LLMs to achieve structured sparsity, boosting inference speed by 25% at 50% sparsity without sacrificing accuracy.
Learn more, but bother less: parameter efficient continual learning
·2442 words·12 mins
Natural Language Processing
Large Language Models
🏢 Pennsylvania State University
LB-CL: A novel parameter-efficient continual learning method for LLMs that boosts performance and reduces forgetting by leveraging parametric knowledge transfer and maintaining orthogonal low-rank sub…
Latent Paraphrasing: Perturbation on Layers Improves Knowledge Injection in Language Models
·2300 words·11 mins
Natural Language Processing
Large Language Models
🏢 KRAFTON
LaPael improves LLM knowledge injection by applying learned noise to early layers, enabling diverse and efficient knowledge updates without repeated external model usage.
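A hypothetical sketch of applying learned, input-dependent noise to an early-layer hidden state; the module name and the form of the perturbation here are assumptions, not LaPael's exact design:

```python
import torch
import torch.nn as nn

class LatentPerturber(nn.Module):
    # Adds input-dependent Gaussian noise to a hidden state so a single
    # training example yields many paraphrase-like latent variants.
    def __init__(self, hidden_size: int):
        super().__init__()
        self.scale = nn.Linear(hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        eps = torch.randn_like(hidden)
        return hidden + self.scale(hidden) * eps
```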
Large Language Models-guided Dynamic Adaptation for Temporal Knowledge Graph Reasoning
·2160 words·11 mins
Natural Language Processing
Large Language Models
🏢 Beijing University of Technology
LLM-DA dynamically adapts LLM-generated rules for interpretable temporal knowledge graph reasoning, significantly improving accuracy without fine-tuning.
Large Language Models Must Be Taught to Know What They Don’t Know
·3020 words·15 mins
Natural Language Processing
Large Language Models
🏢 New York University
Teach LLMs uncertainty for reliable high-stakes predictions: fine-tuning with graded examples significantly improves LLMs’ uncertainty calibration and generalizes well.
Large language model validity via enhanced conformal prediction methods
·2089 words·10 mins
Natural Language Processing
Large Language Models
🏢 Stanford University
New conformal inference methods enhance LLM validity by providing adaptive validity guarantees and improving the quality of LLM outputs, addressing prior methods’ limitations.
Large Language Model Unlearning via Embedding-Corrupted Prompts
·7618 words·36 mins
Natural Language Processing
Large Language Models
🏢 UC Santa Cruz
ECO prompts enable efficient LLM unlearning by corrupting prompts flagged for forgetting, achieving promising results across various LLMs and tasks with minimal side effects.
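A minimal sketch of the corruption step, assuming a separate classifier has already flagged the prompt; ECO itself optimizes the corruption rather than using plain Gaussian noise, so `sigma` here is an illustrative stand-in:

```python
import torch

def corrupt_if_flagged(token_embeddings: torch.Tensor,
                       flagged: bool, sigma: float = 1.0) -> torch.Tensor:
    # Leave ordinary prompts untouched; perturb the embeddings of prompts
    # flagged as targeting forgotten content so the frozen LLM behaves
    # as if it never learned the answer.
    if not flagged:
        return token_embeddings
    return token_embeddings + sigma * torch.randn_like(token_embeddings)
```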