Natural Language Processing
Lambda: Learning Matchable Prior For Entity Alignment with Unlabeled Dangling Cases
·2851 words·14 mins·
AI Generated
Natural Language Processing
Named Entity Recognition
🏢 Shanghai Jiao Tong University
Lambda: A novel framework tackles entity alignment with unlabeled dangling entities using GNN-based encoding, spectral contrastive learning, and an iterative PU learning algorithm, achieving…
LACIE: Listener-Aware Finetuning for Calibration in Large Language Models
·2396 words·12 mins·
Natural Language Processing
Large Language Models
🏢 UNC Chapel Hill
LACIE: Listener-aware finetuning improves LLM confidence calibration, reducing incorrect answers accepted by human listeners by 47% while maintaining correct answer acceptance.
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
·5270 words·25 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 UC Berkeley
KVQuant achieves <0.1 perplexity degradation with 3-bit quantization in LLMs by using per-channel key quantization, pre-RoPE quantization, and non-uniform quantization, enabling 10M context length inference.
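To make "per-channel key quantization" concrete, here is a minimal sketch in PyTorch. It illustrates the general idea only, not KVQuant's implementation: the min-max scheme, shapes, and bit width are assumptions, and the paper's pre-RoPE and non-uniform components are omitted.

```python
import torch

def quantize_keys_per_channel(keys: torch.Tensor, bits: int = 3):
    # keys: (seq_len, head_dim) slice of the key cache for one head.
    # Each channel gets its own scale and zero point, because key
    # activations tend to have a few outlier channels with much
    # larger magnitudes than the rest.
    levels = 2 ** bits - 1
    k_min = keys.min(dim=0, keepdim=True).values
    k_max = keys.max(dim=0, keepdim=True).values
    scale = (k_max - k_min).clamp(min=1e-8) / levels
    q = torch.round((keys - k_min) / scale).clamp(0, levels)
    dequant = q * scale + k_min
    return q.to(torch.uint8), scale, k_min, dequant

keys = torch.randn(128, 64)             # toy cache: 128 tokens, head_dim 64
q, scale, zero, deq = quantize_keys_per_channel(keys)
print((keys - deq).abs().max())         # per-channel reconstruction error
```

Quantizing along channels instead of tokens isolates the outlier channels, which is what keeps the perplexity loss small at 3 bits.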
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
·3037 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Dept. of Computer Science, Rice University
Boost LLM inference speed 1.4-3.5x by using Coupled Quantization (CQ) to compress the KV cache down to 1 bit per channel, while preserving model accuracy.
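The 1-bit-per-channel figure is possible because channels are quantized jointly rather than independently. A toy illustration of that coupling, using plain k-means vector quantization over groups of four channels (group size, codebook size, and the fitting procedure are assumptions; the paper learns its codebooks differently):

```python
import torch

def coupled_quantize(x: torch.Tensor, group: int = 4,
                     codebook_bits: int = 4, iters: int = 10):
    # Jointly quantize groups of `group` channels with one shared
    # codebook: 2**codebook_bits entries over `group` channels costs
    # codebook_bits / group bits per channel (here 4/4 = 1).
    n, d = x.shape
    vecs = x.reshape(n * d // group, group)
    k = 2 ** codebook_bits
    codebook = vecs[torch.randperm(len(vecs))[:k]].clone()
    for _ in range(iters):                    # plain k-means fitting
        assign = torch.cdist(vecs, codebook).argmin(dim=1)
        for j in range(k):
            members = vecs[assign == j]
            if len(members):
                codebook[j] = members.mean(dim=0)
    deq = codebook[assign].reshape(n, d)
    return assign, codebook, deq

x = torch.randn(256, 64)                      # toy KV cache slice
_, _, deq = coupled_quantize(x)
print((x - deq).pow(2).mean())                # distortion at 1 bit/channel
```

Because neighboring channels are statistically dependent, coding them together wastes fewer bits than giving each channel its own quantizer.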
Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
·2061 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Princeton University
Kraken: A new Transformer architecture boosts multi-device inference speed by 35.6% by cleverly overlapping communication with computation.
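The speedup comes from hiding communication latency behind computation. A minimal sketch of that overlap pattern with torch.distributed (the two-branch layer and the shapes are assumptions, not Kraken's architecture):

```python
import os
import torch
import torch.distributed as dist

@torch.no_grad()
def overlapped_block(x, attn, mlp):
    # Start the all-reduce of the attention branch asynchronously,
    # then run the MLP branch, which does not depend on its result,
    # while the communication is in flight.
    y = attn(x)
    work = dist.all_reduce(y, async_op=True)
    z = mlp(x)
    work.wait()               # sync only when y is actually needed
    return y + z

if __name__ == "__main__":
    # Single-process gloo group so the sketch runs standalone.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)
    x = torch.randn(4, 16)
    layer = torch.nn.Linear(16, 16)
    print(overlapped_block(x, layer, layer).shape)
    dist.destroy_process_group()
```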
KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
·1673 words·8 mins·
Natural Language Processing
Large Language Models
🏢 University of Hong Kong
KptLLM: A novel multimodal model leverages LLMs for superior keypoint comprehension, outperforming existing methods in various benchmarks.
Knowledge Circuits in Pretrained Transformers
·3083 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Zhejiang University
Researchers unveil ‘knowledge circuits’ within LLMs, revealing how knowledge is collaboratively encoded and utilized, leading to improved LLM design and interpretations of model behavior.
KnowGPT: Knowledge Graph based Prompting for Large Language Models
·1971 words·10 mins·
Natural Language Processing
Question Answering
🏢 Hong Kong Polytechnic University
KnowGPT: A novel framework boosts Large Language Model accuracy by intelligently integrating knowledge graphs, significantly reducing factual errors and achieving near-human performance on benchmark datasets.
KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge
·3104 words·15 mins·
Natural Language Processing
Large Language Models
🏢 University of Illinois at Urbana-Champaign
KG-FIT boosts knowledge graph embedding by smartly integrating open-world knowledge from LLMs, achieving significant performance gains.
Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting
·2148 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Huawei Noah's Ark Lab
Kangaroo: Double early exiting yields lossless self-speculative decoding that accelerates LLM inference.
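As a hedged sketch of what "double early exiting" buys: a shallow exit of the same model drafts tokens and stops drafting once its own confidence drops, and one full-model pass then verifies the draft. This shows the generic greedy mechanism only; Kangaroo's adapter and exact exit criteria differ, and every name and threshold below is an assumption.

```python
import torch

@torch.no_grad()
def self_speculative_step(full_model, draft_exit, ids, k=4, conf_stop=0.6):
    # Both callables map (1, T) token ids to (1, T, vocab) logits;
    # draft_exit is a shallow early exit of the same network, hence
    # "self-speculative" (no separate draft model to train or serve).
    cur, draft = ids, []
    for _ in range(k):
        probs = draft_exit(cur).softmax(-1)[0, -1]
        p, tok = probs.max(-1)
        if p < conf_stop:                     # second exit: stop drafting
            break
        draft.append(tok)
        cur = torch.cat([cur, tok.view(1, 1)], dim=1)
    full = full_model(cur).argmax(-1)[0]      # one pass verifies all drafts
    n = ids.shape[1]
    accepted = 0
    for i, tok in enumerate(draft):           # keep the matching prefix
        if full[n - 1 + i] == tok:
            accepted += 1
        else:
            break
    kept = cur[:, : n + accepted]
    fix = full[n - 1 + accepted].view(1, 1)   # full model's next token
    return torch.cat([kept, fix], dim=1)
```

Greedy verification makes the output identical to running the full model alone, which is what "lossless" means here.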
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
·1722 words·9 mins·
Natural Language Processing
Large Language Models
🏢 School of Information, Renmin University of China
JiuZhang3.0 efficiently enhances LLMs’ mathematical reasoning by training a small model to synthesize high-quality training data, drastically reducing costs.
Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
·2559 words·13 mins·
Natural Language Processing
Large Language Models
🏢 School of Information Sciences, University of Illinois at Urbana-Champaign
A new benchmark and jailbreak method expose vulnerabilities in LLM moderation guardrails, achieving significantly higher success rates than existing methods.
Iterative Reasoning Preference Optimization
·1561 words·8 mins·
Natural Language Processing
Large Language Models
🏢 Meta FAIR
Iterative Reasoning Preference Optimization boosts large language model reasoning by iteratively refining preferences between generated reasoning steps, achieving significant accuracy gains on benchmarks.
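The iterative recipe is simple to state: sample several reasoning chains per question, label them by final-answer correctness, build preference pairs, optimize, repeat with the updated model. A minimal sketch of the pair-construction step (sample_fn and is_correct are assumed callables; pairing each loser with a random winner is a simplification of the paper's construction):

```python
import random

def build_preference_pairs(question, sample_fn, is_correct, n=8):
    # Sample n chain-of-thought answers, split them by whether the
    # final answer is correct, and pair each loser with a random
    # winner. The resulting (chosen, rejected) pairs feed a DPO-style
    # update, and the whole procedure repeats on the updated model.
    samples = [sample_fn(question) for _ in range(n)]
    winners = [s for s in samples if is_correct(s)]
    losers = [s for s in samples if not is_correct(s)]
    if not winners or not losers:
        return []                   # this question gives no signal
    return [(random.choice(winners), loser) for loser in losers]
```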
Iteration Head: A Mechanistic Study of Chain-of-Thought
·2483 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Meta AI
Researchers reveal how Chain-of-Thought reasoning emerges in transformers via specialized ‘iteration heads’, improving LLM performance and offering insights into mechanistic interpretability.
Is the MMI Criterion Necessary for Interpretability? Degenerating Non-causal Features to Plain Noise for Self-Rationalization
·1904 words·9 mins·
Natural Language Processing
Text Classification
🏢 Huazhong University of Science and Technology
A new criterion maximizes the remaining discrepancy after rationale removal, treating spurious features as plain noise and improving rationale extraction.
Is Programming by Example solved by LLMs?
·2523 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Cornell University
Large Language Models (LLMs) perform surprisingly well on the challenging task of Programming by Example (PBE) when fine-tuned on problem-specific data, outperforming classic symbolic methods and even surpassing…
IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons
·2251 words·11 mins·
Natural Language Processing
Large Language Models
🏢 College of Intelligence and Computing, Tianjin University
IRCAN tackles LLM knowledge conflicts by identifying and reweighting context-aware neurons, significantly improving context-sensitive outputs.
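A hedged sketch of the reweighting step: given per-neuron attribution scores for how strongly each FFN neuron responds to the supplied context, amplify the top-scoring neurons so context-derived signals outweigh conflicting parametric memory. Which matrix is scaled and how the scores are computed are assumptions here, not IRCAN's exact procedure.

```python
import torch

def reweight_context_neurons(w_out: torch.Tensor, scores: torch.Tensor,
                             top_k: int = 16, boost: float = 2.0):
    # w_out: (hidden, d_model) FFN output projection.
    # scores: (hidden,) attribution of each neuron's sensitivity to
    # the context (e.g., from a gradient-based attribution pass).
    # Scale up the rows of the top-k context-aware neurons.
    idx = scores.topk(top_k).indices
    w = w_out.clone()
    w[idx] *= boost
    return w
```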
IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering
·3104 words·15 mins·
AI Generated
Natural Language Processing
Question Answering
🏢 University of Texas at Dallas
IQA-EVAL: An automatic evaluation framework uses LLMs to simulate human-AI interactions and evaluate interactive question answering, achieving high correlation with human judgments.
InversionView: A General-Purpose Method for Reading Information from Neural Activations
·10684 words·51 mins·
loading
·
loading
AI Generated
Natural Language Processing
Large Language Models
🏢 Saarland University
InversionView unveils the inner workings of neural networks by decoding information from activations: it identifies inputs that produce similar activations, revealing their information content. Case studies on v…
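The core query is: which inputs land near a given activation? InversionView trains a conditional decoder to generate such inputs; the sketch below approximates the same preimage more crudely by ranking a fixed candidate pool (the probe hook and the cosine metric are assumptions):

```python
import torch
import torch.nn.functional as F

def approximate_preimage(target_act, candidates, probe, k=5, eps=None):
    # probe(x) returns the activation at the site being studied.
    # Rank candidate inputs by how close their activation is to the
    # target; the closest ones approximate the activation's preimage,
    # i.e., the set of inputs the activation cannot distinguish.
    acts = torch.stack([probe(x) for x in candidates])
    d = 1 - F.cosine_similarity(acts, target_act.unsqueeze(0), dim=-1)
    order = d.argsort()
    keep = order[:k] if eps is None else order[d[order] < eps]
    return [candidates[i] for i in keep]
```

Whatever varies freely within the returned set is information the activation has discarded; whatever stays constant is what it encodes.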
Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation
·2045 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Texas A&M University
Mat2Seq revolutionizes crystal structure generation using language models by creating unique, invariant 1D sequences from 3D crystal structures, enabling accurate and efficient crystal discovery with …