Natural Language Processing

Bileve: Securing Text Provenance in Large Language Models Against Spoofing with Bi-level Signature
·2033 words·10 mins
Natural Language Processing Large Language Models 🏢 Northeastern University
Bileve: a novel bi-level signature secures text provenance in LLMs against spoofing, enhancing detectability and reliability via fine-grained integrity checks and coarse-grained source tracing.
Bias Amplification in Language Model Evolution: An Iterated Learning Perspective
·3378 words·16 mins
Natural Language Processing Large Language Models 🏢 UBC
LLMs’ iterative interactions amplify subtle biases; this paper uses a Bayesian Iterated Learning framework to explain this phenomenon and offers strategies to guide LLM evolution.
Beyond Accuracy: Ensuring Correct Predictions With Correct Rationales
·2877 words·14 mins
AI Generated Natural Language Processing Vision-Language Models 🏢 Department of Computer & Information Science, University of Delaware
This research introduces a novel two-phase approach that improves AI model trustworthiness by ensuring both correct predictions and correct rationales. A new dataset with structured rationales and a rat…
BERTs are Generative In-Context Learners
·2108 words·10 mins
Natural Language Processing Large Language Models 🏢 Language Technology Group, University of Oslo
Masked language models can perform in-context learning, challenging the dominance of causal models in this area.
Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs
·2673 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 University of Maryland
Goldfish Loss: A novel training method for LLMs dramatically reduces memorization without impacting performance, addressing key safety, privacy, and copyright concerns.
Base of RoPE Bounds Context Length
·3176 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Baichuan Inc.
LLM long-context ability is fundamentally limited by RoPE’s base parameter, which determines an absolute lower bound for achievable context length.
BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts
·1807 words·9 mins
Natural Language Processing Large Language Models 🏢 University of Oxford
BAM! Efficiently upcycles pre-trained models into powerful Mixture-of-Experts (MoE) models, achieving state-of-the-art performance with reduced computational costs.
BAdam: A Memory Efficient Full Parameter Optimization Method for Large Language Models
·2359 words·12 mins
Natural Language Processing Large Language Models 🏢 Chinese University of Hong Kong, Shenzhen
BAdam: A memory-efficient optimization method enabling full parameter fine-tuning of large language models using a block coordinate descent framework with Adam’s update rule, achieving comparable or s…
BackdoorAlign: Mitigating Fine-tuning based Jailbreak Attack with Backdoor Enhanced Safety Alignment
·2859 words·14 mins
Natural Language Processing Large Language Models 🏢 University of Wisconsin-Madison
BackdoorAlign defends against fine-tuning-based LLM jailbreaks using a ‘backdoor trigger’ to enforce safety alignment during inference, effectively mitigating risks with minimal additional safety exam…
B'MOJO: Hybrid State Space Realizations of Foundation Models with Eidetic and Fading Memory
·1821 words·9 mins
Natural Language Processing Large Language Models 🏢 AWS AI Labs
B’MOJO: A novel hybrid architecture for foundation models enhances transductive inference by dynamically balancing eidetic and fading memory, leading to efficient and accurate processing of long seque…
AvaTaR: Optimizing LLM Agents for Tool Usage via Contrastive Reasoning
·3104 words·15 mins
AI Generated Natural Language Processing Large Language Models 🏢 Stanford University
AVATAR: A novel automated framework optimizes LLM agents for effective tool usage via contrastive reasoning, significantly boosting performance on complex tasks.
AutoTimes: Autoregressive Time Series Forecasters via Large Language Models
·5046 words·24 mins
AI Generated Natural Language Processing Large Language Models 🏢 Tsinghua University
AutoTimes repurposes LLMs as autoregressive time series forecasters, achieving state-of-the-art results with minimal trainable parameters and faster training/inference.
AutoSurvey: Large Language Models Can Automatically Write Surveys
·2587 words·13 mins
AI Generated Natural Language Processing Large Language Models 🏢 Peking University
AutoSurvey automates comprehensive literature survey creation using LLMs, overcoming challenges of context limitations and knowledge constraints via a novel, efficient, and rigorously evaluated method…
AutoPSV: Automated Process-Supervised Verifier
·2548 words·12 mins
Natural Language Processing Large Language Models 🏢 University of Hong Kong
AutoPSV automates process annotation for LLMs, improving reasoning by detecting confidence shifts in reasoning steps, thus efficiently enhancing model performance.
Autonomous Agents for Collaborative Task under Information Asymmetry
·3171 words·15 mins
AI Generated Natural Language Processing Dialogue Systems 🏢 Tsinghua University
iAgents, a novel multi-agent system leveraging LLMs, overcomes information asymmetry by mirroring human social networks to enable effective collaboration on complex tasks, achieving high accuracy in d…
AutoMix: Automatically Mixing Language Models
·2953 words·14 mins
Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
AutoMix intelligently routes queries to different-sized LLMs based on a smaller model’s self-verification, minimizing cost while maintaining performance.
AutoManual: Generating Instruction Manuals by LLM Agents via Interactive Environmental Learning
·2527 words·12 mins
Natural Language Processing Large Language Models 🏢 Hangzhou Dianzi University
LLM agents can now autonomously build environmental understanding through interactive learning, generating human-readable instruction manuals that boost task success rates.
AutoGuide: Automated Generation and Selection of Context-Aware Guidelines for Large Language Model Agents
·2274 words·11 mins
Natural Language Processing Large Language Models 🏢 University of Michigan
AutoGuide: Automated generation of context-aware guidelines significantly improves LLM agent performance in unfamiliar domains.
Autoformalize Mathematical Statements by Symbolic Equivalence and Semantic Consistency
·2235 words·11 mins
AI Generated Natural Language Processing Large Language Models 🏢 Peking University
Boosting AI’s math skills, this paper introduces a novel framework for autoformalizing mathematical statements, improving accuracy by 0.22-1.35x via symbolic equivalence and semantic consistency check…
Ask, Attend, Attack: An Effective Decision-Based Black-Box Targeted Attack for Image-to-Text Models
·3219 words·16 mins
AI Generated Natural Language Processing Vision-Language Models 🏢 Xiamen University
This paper introduces AAA, a novel three-stage decision-based black-box targeted attack against image-to-text models. AAA efficiently generates semantically consistent adversarial examples by asking …