Large Language Models

Jailbreaking with Universal Multi-Prompts

3 February 2025·5963 words·28 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National Taiwan University

JUMP outperforms existing methods by optimizing universal multi-prompts for jailbreaking LLMs, offering a more efficient and generalizable approach to LLM adversarial attacks.

FastKV: KV Cache Compression for Fast Long-Context Processing with Token-Selective Propagation

3 February 2025·4575 words·22 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Department of Electrical and Computer Engineering, Seoul National University

FastKV: A novel KV cache compression method speeds up long-context LLM processing 2x by selectively propagating tokens and using GQA-aware compression, maintaining accuracy.

Almost Surely Safe Alignment of Large Language Models at Inference-Time

3 February 2025·2605 words·13 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University

InferenceGuard ensures almost-sure safe LLM responses at inference time by framing safe generation as a constrained Markov Decision Process in the LLM’s latent space, achieving high safety rates witho…

A Probabilistic Inference Approach to Inference-Time Scaling of LLMs using Particle-Based Monte Carlo Methods

3 February 2025·2572 words·13 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 MIT

Boosting Large Language Model (LLM) inference speed using probabilistic inference via particle-based Monte Carlo methods achieves 4-16x better scaling than deterministic search approaches.

WILDCHAT-50M: A Deep Dive Into the Role of Synthetic Data in Post-Training

30 January 2025·3471 words·17 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NYU

WILDCHAT-50M: Largest public chat dataset refines LLM post-training, showing superior SFT performance with fewer samples.

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

30 January 2025·2085 words·10 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tencent AI Lab

Large language models (LLMs) often prematurely abandon promising reasoning paths, a phenomenon called ‘underthinking’. This paper introduces a novel metric to quantify this issue and proposes a decodi…

GuardReasoner: Towards Reasoning-based LLM Safeguards

30 January 2025·5624 words·27 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National University of Singapore

GuardReasoner enhances LLM safety with reasoning-based guardrails, improving performance, explainability, and generalization on various benchmarks.

Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation

29 January 2025·3468 words·17 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Georgia Institute of Technology

Virus: A new attack method easily bypasses LLM guardrails, highlighting the inadequacy of current safety measures and urging for more robust solutions.

Large Language Models Think Too Fast To Explore Effectively

29 January 2025·3497 words·17 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Georgia Institute of Technology

Large language models underperform humans in open-ended exploration due to prioritizing immediate choices over long-term strategic thinking, but innovative models show promise.

Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate

29 January 2025·2552 words·12 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Carnegie Mellon University

Critique Fine-Tuning (CFT) outperforms traditional supervised fine-tuning (SFT) in training language models, achieving comparable results with significantly less data and opening new avenues in AI.

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

28 January 2025·3663 words·18 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC Berkeley

Reinforcement learning (RL) surpasses supervised fine-tuning (SFT) in fostering generalization in foundation models, while SFT aids RL’s stability; a comparative study across text and visual domains r…

SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model

28 January 2025·4043 words·19 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing

SafeRAG: A new benchmark exposes critical security vulnerabilities in Retrieval-Augmented Generation (RAG) systems by introducing four novel attack types and a comprehensive dataset for evaluation, re…

Over-Tokenized Transformer: Vocabulary is Generally Worth Scaling

28 January 2025·3794 words·18 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Seed-Foundation-Model Team, Bytedance

Boosting Large Language Model (LLM) performance, researchers introduce Over-Tokenized Transformers, decoupling input/output vocabularies to improve language modeling. Scaling input vocabularies improv…

Optimizing Large Language Model Training Using FP4 Quantization

28 January 2025·1562 words·8 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research

First-ever FP4 training framework for LLMs achieves accuracy comparable to BF16 and FP8, enabling efficient ultra-low precision training.

Histoires Morales: A French Dataset for Assessing Moral Alignment

28 January 2025·8270 words·39 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Laboratoire Hubert Curien

HISTOIRESMORALES: a new French dataset tackles the crucial issue of aligning language models with human moral values, providing valuable resources for ethical AI research in a previously underserved l…

IndicMMLU-Pro: Benchmarking Indic Large Language Models on Multi-Task Language Understanding

27 January 2025·2564 words·13 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Artificial Intelligence Institute, University of South Carolina

IndicMMLU-Pro: a new benchmark rigorously evaluates large language models’ multi-task language understanding capabilities across nine major Indian languages, pushing Indic language AI research forward…

Atla Selene Mini: A General Purpose Evaluation Model

27 January 2025·1893 words·9 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Atla

Atla Selene Mini: A state-of-the-art small LLM judge surpassing larger models in benchmark performance!

ARWKV: Pretrain is not what we need, an RNN-Attention-Based Language Model Born from Transformer

26 January 2025·1758 words·9 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Peking University

ARWKV: A novel RNN-attention-based language model, distilled from a larger model, achieves strong performance using significantly fewer resources, opening a new path in efficient language model develo…

RealCritic: Towards Effectiveness-Driven Evaluation of Language Model Critiques

24 January 2025·2423 words·12 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 the Chinese University of Hong Kong, Shenzhen

RealCritic: A new benchmark effectively evaluates language models’ critique abilities using a closed-loop methodology, showcasing advanced reasoning models’ superiority in self and iterative critique.

Humanity's Last Exam

24 January 2025·2314 words·11 mins· loading · loading

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Center for AI Safety

Humanity’s Last Exam (HLE): a groundbreaking multi-modal benchmark pushing the boundaries of large language model (LLM) capabilities, revealing a significant gap between current LLMs and human experts…