Natural Language Processing
Lambda: Learning Matchable Prior For Entity Alignment with Unlabeled Dangling Cases
·2851 words·14 mins·
AI Generated
Natural Language Processing
Named Entity Recognition
🏢 Shanghai Jiao Tong University
Lambda: A novel framework tackles entity alignment with unlabeled dangling entities using GNN-based encoding, spectral contrastive learning, and an iterative PU learning algorithm, achieving…
LACIE: Listener-Aware Finetuning for Calibration in Large Language Models
·2396 words·12 mins·
Natural Language Processing
Large Language Models
🏢 UNC Chapel Hill
LACIE: Listener-aware finetuning improves LLM confidence calibration, reducing incorrect answers accepted by human listeners by 47% while maintaining correct answer acceptance.
KVQuant: Towards 10 Million Context Length LLM Inference with KV Cache Quantization
·5270 words·25 mins·
AI Generated
Natural Language Processing
Large Language Models
🏢 UC Berkeley
KVQuant achieves <0.1 perplexity degradation with 3-bit quantization in LLMs by using per-channel key quantization, pre-RoPE quantization, and non-uniform quantization, enabling 10M context length inference.
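To make "per-channel key quantization" concrete, here is a minimal sketch in PyTorch. It illustrates the general idea only, not KVQuant's implementation: the min-max scheme, shapes, and bit width are assumptions, and the paper's pre-RoPE and non-uniform components are omitted.

```python
import torch

def quantize_keys_per_channel(keys: torch.Tensor, bits: int = 3):
    # keys: (seq_len, head_dim) slice of the key cache for one head.
    # Each channel gets its own scale and zero point, because key
    # activations tend to have a few outlier channels with much
    # larger magnitudes than the rest.
    levels = 2 ** bits - 1
    k_min = keys.min(dim=0, keepdim=True).values
    k_max = keys.max(dim=0, keepdim=True).values
    scale = (k_max - k_min).clamp(min=1e-8) / levels
    q = torch.round((keys - k_min) / scale).clamp(0, levels)
    dequant = q * scale + k_min
    return q.to(torch.uint8), scale, k_min, dequant

keys = torch.randn(128, 64)             # toy cache: 128 tokens, head_dim 64
q, scale, zero, deq = quantize_keys_per_channel(keys)
print((keys - deq).abs().max())         # per-channel reconstruction error
```

Quantizing along channels instead of tokens isolates the outlier channels, which is what keeps the perplexity loss small at 3 bits.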
KV Cache is 1 Bit Per Channel: Efficient Large Language Model Inference with Coupled Quantization
·3037 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Dept. of Computer Science, Rice University
Boost LLM inference speed 1.4-3.5x by using Coupled Quantization (CQ) to compress the KV cache down to 1 bit per channel, while preserving model accuracy.
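The 1-bit-per-channel figure is possible because channels are quantized jointly rather than independently. A toy illustration of that coupling, using plain k-means vector quantization over groups of four channels (group size, codebook size, and the fitting procedure are assumptions; the paper learns its codebooks differently):

```python
import torch

def coupled_quantize(x: torch.Tensor, group: int = 4,
                     codebook_bits: int = 4, iters: int = 10):
    # Jointly quantize groups of `group` channels with one shared
    # codebook: 2**codebook_bits entries over `group` channels costs
    # codebook_bits / group bits per channel (here 4/4 = 1).
    n, d = x.shape
    vecs = x.reshape(n * d // group, group)
    k = 2 ** codebook_bits
    codebook = vecs[torch.randperm(len(vecs))[:k]].clone()
    for _ in range(iters):                    # plain k-means fitting
        assign = torch.cdist(vecs, codebook).argmin(dim=1)
        for j in range(k):
            members = vecs[assign == j]
            if len(members):
                codebook[j] = members.mean(dim=0)
    deq = codebook[assign].reshape(n, d)
    return assign, codebook, deq

x = torch.randn(256, 64)                      # toy KV cache slice
_, _, deq = coupled_quantize(x)
print((x - deq).pow(2).mean())                # distortion at 1 bit/channel
```

Because neighboring channels are statistically dependent, coding them together wastes fewer bits than giving each channel its own quantizer.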
Kraken: Inherently Parallel Transformers For Efficient Multi-Device Inference
·2061 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Princeton University
Kraken: A new Transformer architecture boosts multi-device inference speed by 35.6% by cleverly overlapping communication with computation.
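The speedup comes from hiding communication latency behind computation. A minimal sketch of that overlap pattern with torch.distributed (the two-branch layer and the shapes are assumptions, not Kraken's architecture):

```python
import os
import torch
import torch.distributed as dist

@torch.no_grad()
def overlapped_block(x, attn, mlp):
    # Start the all-reduce of the attention branch asynchronously,
    # then run the MLP branch, which does not depend on its result,
    # while the communication is in flight.
    y = attn(x)
    work = dist.all_reduce(y, async_op=True)
    z = mlp(x)
    work.wait()               # sync only when y is actually needed
    return y + z

if __name__ == "__main__":
    # Single-process gloo group so the sketch runs standalone.
    os.environ.setdefault("MASTER_ADDR", "127.0.0.1")
    os.environ.setdefault("MASTER_PORT", "29500")
    dist.init_process_group("gloo", rank=0, world_size=1)
    x = torch.randn(4, 16)
    layer = torch.nn.Linear(16, 16)
    print(overlapped_block(x, layer, layer).shape)
    dist.destroy_process_group()
```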
KptLLM: Unveiling the Power of Large Language Model for Keypoint Comprehension
·1673 words·8 mins·
Natural Language Processing
Large Language Models
🏢 University of Hong Kong
KptLLM: A novel multimodal model leverages LLMs for superior keypoint comprehension, outperforming existing methods in various benchmarks.
Knowledge Circuits in Pretrained Transformers
·3083 words·15 mins·
Natural Language Processing
Large Language Models
🏢 Zhejiang University
Researchers unveil ‘knowledge circuits’ within LLMs, revealing how knowledge is collaboratively encoded and utilized, leading to improved LLM design and interpretations of model behavior.
KnowGPT: Knowledge Graph based Prompting for Large Language Models
·1971 words·10 mins·
Natural Language Processing
Question Answering
🏢 Hong Kong Polytechnic University
KnowGPT: A novel framework boosts Large Language Model accuracy by intelligently integrating knowledge graphs, significantly reducing factual errors and achieving near-human performance on benchmark datasets.
KG-FIT: Knowledge Graph Fine-Tuning Upon Open-World Knowledge
·3104 words·15 mins·
Natural Language Processing
Large Language Models
🏢 University of Illinois at Urbana-Champaign
KG-FIT boosts knowledge graph embedding by smartly integrating open-world knowledge from LLMs, achieving significant performance gains.
Kangaroo: Lossless Self-Speculative Decoding for Accelerating LLMs via Double Early Exiting
·2148 words·11 mins·
Natural Language Processing
Large Language Models
🏢 Huawei Noah's Ark Lab
Kangaroo: Double early exiting yields lossless self-speculative decoding that accelerates LLM inference.
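As a hedged sketch of what "double early exiting" buys: a shallow exit of the same model drafts tokens and stops drafting once its own confidence drops, and one full-model pass then verifies the draft. This shows the generic greedy mechanism only; Kangaroo's adapter and exact exit criteria differ, and every name and threshold below is an assumption.

```python
import torch

@torch.no_grad()
def self_speculative_step(full_model, draft_exit, ids, k=4, conf_stop=0.6):
    # Both callables map (1, T) token ids to (1, T, vocab) logits;
    # draft_exit is a shallow early exit of the same network, hence
    # "self-speculative" (no separate draft model to train or serve).
    cur, draft = ids, []
    for _ in range(k):
        probs = draft_exit(cur).softmax(-1)[0, -1]
        p, tok = probs.max(-1)
        if p < conf_stop:                     # second exit: stop drafting
            break
        draft.append(tok)
        cur = torch.cat([cur, tok.view(1, 1)], dim=1)
    full = full_model(cur).argmax(-1)[0]      # one pass verifies all drafts
    n = ids.shape[1]
    accepted = 0
    for i, tok in enumerate(draft):           # keep the matching prefix
        if full[n - 1 + i] == tok:
            accepted += 1
        else:
            break
    kept = cur[:, : n + accepted]
    fix = full[n - 1 + accepted].view(1, 1)   # full model's next token
    return torch.cat([kept, fix], dim=1)
```

Greedy verification makes the output identical to running the full model alone, which is what "lossless" means here.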
JiuZhang3.0: Efficiently Improving Mathematical Reasoning by Training Small Data Synthesis Models
·1722 words·9 mins·
Natural Language Processing
Large Language Models
🏢 School of Information, Renmin University of China
JiuZhang3.0 efficiently enhances LLMs’ mathematical reasoning by training a small model to synthesize high-quality training data, drastically reducing costs.
Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
·2559 words·13 mins·
Natural Language Processing
Large Language Models
🏢 School of Information Sciences, University of Illinois at Urbana-Champaign
A new benchmark and jailbreak method expose vulnerabilities in LLM moderation guardrails, achieving significantly higher success rates than existing methods.
Iterative Reasoning Preference Optimization
·1561 words·8 mins·
Natural Language Processing
Large Language Models
🏢 Meta FAIR
Iterative Reasoning Preference Optimization boosts large language model reasoning by iteratively refining preferences between generated reasoning steps, achieving significant accuracy gains on benchmarks.
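The iterative recipe is simple to state: sample several reasoning chains per question, label them by final-answer correctness, build preference pairs, optimize, repeat with the updated model. A minimal sketch of the pair-construction step (sample_fn and is_correct are assumed callables; pairing each loser with a random winner is a simplification of the paper's construction):

```python
import random

def build_preference_pairs(question, sample_fn, is_correct, n=8):
    # Sample n chain-of-thought answers, split them by whether the
    # final answer is correct, and pair each loser with a random
    # winner. The resulting (chosen, rejected) pairs feed a DPO-style
    # update, and the whole procedure repeats on the updated model.
    samples = [sample_fn(question) for _ in range(n)]
    winners = [s for s in samples if is_correct(s)]
    losers = [s for s in samples if not is_correct(s)]
    if not winners or not losers:
        return []                   # this question gives no signal
    return [(random.choice(winners), loser) for loser in losers]
```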
Iteration Head: A Mechanistic Study of Chain-of-Thought
·2483 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Meta AI
Researchers reveal how Chain-of-Thought reasoning emerges in transformers via specialized ‘iteration heads’, improving LLM performance and offering insights into mechanistic interpretability.
Is the MMI Criterion Necessary for Interpretability? Degenerating Non-causal Features to Plain Noise for Self-Rationalization
·1904 words·9 mins·
Natural Language Processing
Text Classification
🏢 Huazhong University of Science and Technology
A new criterion maximizes the remaining discrepancy after rationale removal, treating spurious features as plain noise and improving rationale extraction.
Is Programming by Example solved by LLMs?
·2523 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Cornell University
Large Language Models (LLMs) perform surprisingly well on the challenging task of Programming by Example (PBE) when fine-tuned on problem-specific data, outperforming classic symbolic methods and even surpassing…
IRCAN: Mitigating Knowledge Conflicts in LLM Generation via Identifying and Reweighting Context-Aware Neurons
·2251 words·11 mins·
Natural Language Processing
Large Language Models
🏢 College of Intelligence and Computing, Tianjin University
IRCAN tackles LLM knowledge conflicts by identifying and reweighting context-aware neurons, significantly improving context-sensitive outputs.
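A hedged sketch of the reweighting step: given per-neuron attribution scores for how strongly each FFN neuron responds to the supplied context, amplify the top-scoring neurons so context-derived signals outweigh conflicting parametric memory. Which matrix is scaled and how the scores are computed are assumptions here, not IRCAN's exact procedure.

```python
import torch

def reweight_context_neurons(w_out: torch.Tensor, scores: torch.Tensor,
                             top_k: int = 16, boost: float = 2.0):
    # w_out: (hidden, d_model) FFN output projection.
    # scores: (hidden,) attribution of each neuron's sensitivity to
    # the context (e.g., from a gradient-based attribution pass).
    # Scale up the rows of the top-k context-aware neurons.
    idx = scores.topk(top_k).indices
    w = w_out.clone()
    w[idx] *= boost
    return w
```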
IQA-EVAL: Automatic Evaluation of Human-Model Interactive Question Answering
·3104 words·15 mins·
AI Generated
Natural Language Processing
Question Answering
🏢 University of Texas at Dallas
IQA-EVAL: An automatic evaluation framework uses LLMs to simulate human-AI interactions and evaluate interactive question answering, achieving high correlation with human judgments.
InversionView: A General-Purpose Method for Reading Information from Neural Activations
·10684 words·51 mins·
loading
·
loading
AI Generated
Natural Language Processing
Large Language Models
🏢 Saarland University
InversionView unveils the inner workings of neural networks by decoding information from activations: it identifies inputs that produce similar activations, revealing their information content. Case studies on v…
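The core query is: which inputs land near a given activation? InversionView trains a conditional decoder to generate such inputs; the sketch below approximates the same preimage more crudely by ranking a fixed candidate pool (the probe hook and the cosine metric are assumptions):

```python
import torch
import torch.nn.functional as F

def approximate_preimage(target_act, candidates, probe, k=5, eps=None):
    # probe(x) returns the activation at the site being studied.
    # Rank candidate inputs by how close their activation is to the
    # target; the closest ones approximate the activation's preimage,
    # i.e., the set of inputs the activation cannot distinguish.
    acts = torch.stack([probe(x) for x in candidates])
    d = 1 - F.cosine_similarity(acts, target_act.unsqueeze(0), dim=-1)
    order = d.argsort()
    keep = order[:k] if eps is None else order[d[order] < eps]
    return [candidates[i] for i in keep]
```

Whatever varies freely within the returned set is information the activation has discarded; whatever stays constant is what it encodes.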
Invariant Tokenization of Crystalline Materials for Language Model Enabled Generation
·2045 words·10 mins·
Natural Language Processing
Large Language Models
🏢 Texas A&M University
Mat2Seq revolutionizes crystal structure generation using language models by creating unique, invariant 1D sequences from 3D crystal structures, enabling accurate and efficient crystal discovery with …