
Natural Language Processing

CoMERA: Computing- and Memory-Efficient Training via Rank-Adaptive Tensor Optimization
·2398 words·12 mins
Natural Language Processing Large Language Models 🏢 University at Albany, SUNY
CoMERA achieves 2-3x faster AI model training via rank-adaptive tensor optimization, significantly improving both computing and memory efficiency.
Combining Observational Data and Language for Species Range Estimation
·2627 words·13 mins
Natural Language Processing Vision-Language Models 🏢 University of Massachusetts Amherst
LE-SINR combines Wikipedia species descriptions with citizen science observations to create accurate species range maps, even with limited data, outperforming existing methods.
COLD: Causal reasOning in cLosed Daily activities
·3472 words·17 mins
AI Generated Natural Language Processing Large Language Models 🏢 Indian Institute of Technology Kanpur
COLD framework rigorously evaluates LLMs’ causal reasoning in everyday scenarios using 9 million causal queries derived from human-generated scripts of daily activities.
Coevolving with the Other You: Fine-Tuning LLM with Sequential Cooperative Multi-Agent Reinforcement Learning
·2454 words·12 mins
Natural Language Processing Large Language Models 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences
CORY, a sequential cooperative multi-agent RL framework, fine-tunes an LLM by coevolving two copies of the model as cooperating agents, improving over standard single-agent RL fine-tuning.
CodeRosetta: Pushing the Boundaries of Unsupervised Code Translation for Parallel Programming
·5423 words·26 mins
AI Generated Natural Language Processing Machine Translation 🏢 Iowa State University
CodeRosetta pushes the boundaries of unsupervised code translation by introducing the first encoder-decoder model that efficiently translates between programming languages and their parallel HPC extensions.
Code Repair with LLMs gives an Exploration-Exploitation Tradeoff
·3695 words·18 mins
Natural Language Processing Large Language Models 🏢 Cornell University
New program synthesis method, REX, leverages Thompson Sampling to balance exploration and exploitation in iterative LLM code refinement, solving more problems with fewer model calls.
CLUES: Collaborative Private-domain High-quality Data Selection for LLMs via Training Dynamics
·2368 words·12 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Cambridge
CLUES: Collaborative learning selects high-quality private data for LLM fine-tuning via training dynamics, significantly boosting performance in diverse domains.
CigTime: Corrective Instruction Generation Through Inverse Motion Editing
·2228 words·11 mins
Natural Language Processing Vision-Language Models 🏢 Hong Kong University of Science and Technology
CigTime generates corrective motion instructions from motion pairs using motion editing and large language models. This approach improves upon baselines by leveraging motion triplets for fine-tuning.
Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models
·1676 words·8 mins
Natural Language Processing Large Language Models 🏢 Shanghai University of Finance and Economics
CherryQ, a novel quantization method, leverages parameter heterogeneity in LLMs to achieve superior performance by selectively quantizing less critical parameters while preserving essential ones.
ChatTracker: Enhancing Visual Tracking Performance via Chatting with Multimodal Large Language Model
·1826 words·9 mins
Natural Language Processing Vision-Language Models 🏢 East China Normal University
ChatTracker boosts visual tracking by intelligently using a large language model to refine object descriptions, achieving performance on par with state-of-the-art methods.
ChatQA: Surpassing GPT-4 on Conversational QA and RAG
·4802 words·23 mins
AI Generated Natural Language Processing Question Answering 🏢 NVIDIA
ChatQA, a new suite of models, outperforms GPT-4 in conversational QA and RAG by using a two-stage instruction tuning method and a cost-effective dense retriever.
Chat-Scene: Bridging 3D Scene and Large Language Models with Object Identifiers
·2344 words·12 mins
Natural Language Processing Large Language Models 🏢 Zhejiang University
Chat-Scene: Bridging 3D scenes and LLMs using object identifiers for efficient, object-level interaction and improved scene comprehension.
Chain-of-Thought Reasoning Without Prompting
·2324 words·11 mins
Natural Language Processing Large Language Models 🏢 Google DeepMind
LLMs can reason effectively without prompting by simply adjusting the decoding process to reveal inherent chain-of-thought paths.
Chain of Thoughtlessness? An Analysis of CoT in Planning
·2944 words·14 mins
Natural Language Processing Large Language Models 🏢 Arizona State University
Chain of Thought prompting in LLMs offers limited generalizability, providing performance gains only when prompts are highly specific to problem types, highlighting a critical trade-off between performance and generality.
Chain of Preference Optimization: Improving Chain-of-Thought Reasoning in LLMs
·2704 words·13 mins
Natural Language Processing Large Language Models 🏢 Sea AI Lab, Singapore
Chain of Preference Optimization (CPO) dramatically improves LLM reasoning by leveraging ToT’s search tree for efficient fine-tuning, achieving similar or better performance with significantly reduced inference cost.
Chain of Agents: Large Language Models Collaborating on Long-Context Tasks
·3007 words·15 mins
Natural Language Processing Large Language Models 🏢 Google Cloud AI Research
Chain-of-Agents (CoA) framework uses multi-agent collaboration to efficiently process long contexts for LLMs, significantly improving performance on various tasks.
Causal language modeling can elicit search and reasoning capabilities on logic puzzles
·2119 words·10 mins
Natural Language Processing Large Language Models 🏢 University of Texas at Austin
LLMs surprisingly master complex logic puzzles like Sudoku and Zebra puzzles after training on strategically ordered solution steps, revealing hidden reasoning abilities.
Cascade Speculative Drafting for Even Faster LLM Inference
·1806 words·9 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Illinois at Urbana-Champaign
Cascade Speculative Drafting (CS Drafting) dramatically speeds up large language model inference by using a multi-stage drafting process, optimizing both time allocation and autoregressive generation.
Can Models Learn Skill Composition from Examples?
·3161 words·15 mins
Natural Language Processing Large Language Models 🏢 Princeton University
Smaller language models can learn skill composition from limited examples, substantially improving their ability to combine skills in novel ways through fine-tuning.
Can LLMs Learn by Teaching for Better Reasoning? A Preliminary Study
·5164 words·25 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
LLMs can improve reasoning by teaching weaker models, a process called Learning by Teaching (LbT), as shown in this preliminary study. LbT enhances not just the student models, but also the teacher model itself.