Large Language Models

How to Synthesize Text Data without Model Collapse?

19 December 2024·5702 words·27 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University

Token-level editing prevents language model collapse from synthetic data by theoretically bounding test error and empirically improving model performance.

Fietje: An open, efficient LLM for Dutch

19 December 2024·3094 words·15 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 KU Leuven

Fietje: an open-source, efficient Dutch language model outperforming larger models.

AceMath: Advancing Frontier Math Reasoning with Post-Training and Reward Modeling

19 December 2024·3123 words·15 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 NVIDIA Research

AceMath achieves state-of-the-art results in mathematical reasoning by introducing highly effective instruction-tuned models and reward models.

TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks

18 December 2024·2677 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Carnegie Mellon University

AI agents are tested in a simulated company, revealing their capability to automate tasks and shortcomings with complex workflows and interfaces.

RAG-RewardBench: Benchmarking Reward Models in Retrieval Augmented Generation for Preference Alignment

18 December 2024·4393 words·21 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 School of Artificial Intelligence, University of Chinese Academy of Sciences

First benchmark for RAG reward models reveals their limitations and the need for preference-aligned training.

Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN

18 December 2024·2716 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Surrey

Mix-LN combines Pre-LN and Post-LN so that the deeper layers of LLMs contribute more effectively.

AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge

18 December 2024·2611 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Nanyang Technological University

Auto-built benchmark with up-to-date knowledge ensures contamination-free LLM evaluation.

Are Your LLMs Capable of Stable Reasoning?

17 December 2024·2140 words·11 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Shanghai AI Laboratory

G-Pass@k & LiveMathBench: evaluating the reasoning stability of LLMs.

SPaR: Self-Play with Tree-Search Refinement to Improve Instruction-Following in Large Language Models

16 December 2024·3747 words·18 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University

The self-play method SPaR enhances LLMs' instruction-following abilities, beating GPT-4 on IFEval.

SepLLM: Accelerate Large Language Models by Compressing One Segment into One Separator

16 December 2024·3575 words·17 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Huawei Noah's Ark Lab

SepLLM compresses each segment into its separator token, speeding up LLMs by over 50% without losing much accuracy.

Smaller Language Models Are Better Instruction Evolvers

15 December 2024·5507 words·26 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Beijing University of Posts and Telecommunications

Smaller is better: SLMs outperform LLMs in evolving complex & diverse instructions for AI training.

SCBench: A KV Cache-Centric Analysis of Long-Context Methods

13 December 2024·5380 words·26 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Corporation

New benchmark for evaluating long-context models finds sub-O(n) methods lacking in real-world use cases.

Byte Latent Transformer: Patches Scale Better Than Tokens

13 December 2024·4848 words·23 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Washington

BLT: a tokenizer-free LLM built for efficiency and robustness.

The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective

12 December 2024·1893 words·9 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 National Library of Norway

Norwegian researchers show that training on copyrighted material improves LLMs but raises legal and ethical issues.

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios

12 December 2024·3495 words·17 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 UC Santa Barbara

RULEARENA, a new benchmark, rigorously evaluates large language models’ ability to apply complex, real-world rules across diverse scenarios, revealing significant shortcomings in current LLMs’ rule-gu…

Phi-4 Technical Report

12 December 2024·2630 words·13 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Microsoft Research

Phi-4: a 14B parameter LLM surpassing its teacher model (GPT-4) in STEM-focused QA through innovative synthetic data generation and post-training techniques.

JuStRank: Benchmarking LLM Judges for System Ranking

12 December 2024·13985 words·66 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 IBM Research

JuStRank: LLM system ranker benchmark reveals critical judge qualities (decisiveness, bias) impacting ranking accuracy, highlighting instance-level performance doesn’t guarantee accurate system-level…

SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs

11 December 2024·2774 words·14 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Saudi Data & Artificial Intelligence Authority

Fine-tuning small language models? Tweak the learning rate and batch size for a reasoning boost!

Granite Guardian

10 December 2024·4191 words·20 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 IBM Research

Granite Guardian: Open-source risk detection models for LLMs, surpassing existing models in accuracy and offering comprehensive coverage across multiple risk dimensions, promoting safer AI.

Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation

10 December 2024·1928 words·10 mins

AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Pisa

Contextualized AI counterspeech significantly outperforms generic methods by adapting to the moderation context and user, improving persuasiveness without sacrificing other qualities.