
Large Language Models

Chain of Thoughtlessness? An Analysis of CoT in Planning
·2944 words·14 mins·
Natural Language Processing Large Language Models 🏢 Arizona State University
Chain of Thought prompting in LLMs offers limited generalizability, providing performance gains only when prompts are highly specific to problem types, highlighting a critical trade-off between perfor…
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
·2704 words·13 mins·
Natural Language Processing Large Language Models 🏢 Sea AI Lab, Singapore
Chain of Preference Optimization (CPO) dramatically improves LLM reasoning by leveraging ToT’s search tree for efficient fine-tuning, achieving similar or better performance with significantly reduced…
Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
·3007 words·15 mins·
Natural Language Processing Large Language Models 🏢 Google Cloud AI Research
Chain-of-Agents (CoA) framework uses multi-agent collaboration to efficiently process long contexts for LLMs, significantly improving performance on various tasks.
Causal language modeling can elicit search and reasoning capabilities on logic puzzles
·2119 words·10 mins·
Natural Language Processing Large Language Models 🏢 University of Texas at Austin
LLMs surprisingly master complex logic puzzles like Sudoku and Zebra puzzles after training on strategically ordered solution steps, revealing hidden reasoning abilities.
Cascade Speculative Drafting for Even Faster LLM Inference
·1806 words·9 mins·
AI Generated Natural Language Processing Large Language Models 🏢 University of Illinois at Urbana-Champaign
Cascade Speculative Drafting (CS Drafting) dramatically speeds up large language model inference by using a multi-stage drafting process, optimizing both time allocation and autoregressive generation.
Can Models Learn Skill Composition from Examples?
·3161 words·15 mins·
Natural Language Processing Large Language Models 🏢 Princeton University
Smaller language models can learn skill composition from limited examples, substantially improving their ability to combine skills in novel ways through fine-tuning.
Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study
·5164 words·25 mins·
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
This preliminary study shows that LLMs can improve their reasoning by teaching weaker models, a process called Learning by Teaching (LbT). LbT enhances not just the student models, but also the teacher model…
Can LLMs Implicitly Learn Numeric Parameter Constraints in Data Science APIs?
·2762 words·13 mins·
Natural Language Processing Large Language Models 🏢 University of Illinois Urbana-Champaign
LLMs struggle to reliably generate valid data science code due to a lack of true understanding of numerical constraints in APIs, despite seemingly mastering common patterns through extensive training.
Can large language models explore in-context?
·4498 words·22 mins·
AI Generated Natural Language Processing Large Language Models 🏢 Microsoft Research
LLMs struggle with in-context exploration, needing substantial prompt engineering or training interventions to effectively explore multi-armed bandit environments.
Can Large Language Model Agents Simulate Human Trust Behavior?
·3567 words·17 mins·
Natural Language Processing Large Language Models 🏢 University of Oxford
LLM agents surprisingly exhibit human-like trust behavior, especially GPT-4, paving the way for simulating complex human interactions in various applications.
Can Language Models Perform Robust Reasoning in Chain-of-thought Prompting with Noisy Rationales?
·7400 words·35 mins·
AI Generated Natural Language Processing Large Language Models 🏢 Hong Kong Baptist University
LLMs struggle with noisy rationales in chain-of-thought prompting. This paper introduces the NoRa dataset, which exposes the weaknesses of existing methods, and proposes CD-CoT, a new method that significantly improves accura…
Can Language Models Learn to Skip Steps?
·2929 words·14 mins·
Natural Language Processing Large Language Models 🏢 UC Santa Barbara
Language models learn to skip steps in reasoning, improving efficiency and generalization, showcasing emergent human-like cognitive abilities.
Can Graph Learning Improve Planning in LLM-based Agents?
·2929 words·14 mins·
Natural Language Processing Large Language Models 🏢 Peking University
GNNs enhance LLM-based task planning by improving the ability to process task graphs, surpassing existing solutions even without training.
Calibrating Reasoning in Language Models with Internal Consistency
·2546 words·12 mins·
Natural Language Processing Large Language Models 🏢 Shanghai Jiao Tong University
LLMs’ reasoning can be improved by using internal consistency to calibrate their outputs.
Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment
·2104 words·10 mins·
Natural Language Processing Large Language Models 🏢 Artificial Intelligence Research Laboratory, Pennsylvania State University
Cal-DPO calibrates implicit rewards in contrastive preference learning, dramatically improving large language model alignment with human preferences.
Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers
·2873 words·14 mins·
Natural Language Processing Large Language Models 🏢 CLAIRE, EPFL
Training large language models efficiently is critical; this paper shows how structured feedforward layers and a novel training regime significantly reduce computational costs and improve training …
Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models
·2186 words·11 mins·
Large Language Models 🏢 Peking University
Buffer of Thoughts (BoT) boosts Large Language Model reasoning by storing and reusing high-level ‘thought-templates’, achieving significant accuracy and efficiency gains across diverse tasks.
Bridging The Gap between Low-rank and Orthogonal Adaptation via Householder Reflection Adaptation
·2524 words·12 mins·
Large Language Models 🏢 Renmin University of China
Householder Reflection Adaptation (HRA) bridges low-rank and orthogonal LLM adaptation, achieving superior performance with fewer parameters than existing methods by using a chain of Householder refl…
Bridge-IF: Learning Inverse Protein Folding with Markov Bridges
·1691 words·8 mins·
Natural Language Processing Large Language Models 🏢 Zhejiang University
Bridge-IF, a novel generative diffusion model, excels at inverse protein folding by learning probabilistic dependencies between protein structures and sequences, significantly outperforming existing m…
Boosting the Potential of Large Language Models with an Intelligent Information Assistant
·1837 words·9 mins·
Natural Language Processing Large Language Models 🏢 Tsinghua University
Boosting LLMs with an intelligent information assistant, ASSISTRAG, significantly improves accuracy and reasoning, especially for less advanced models.