Large Language Models
SCBench: A KV Cache-Centric Analysis of Long-Context Methods
·5380 words·26 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Corporation
New benchmark for evaluating long-context models finds sub-O(n) methods lacking in real-world use cases.
Byte Latent Transformer: Patches Scale Better Than Tokens
·4848 words·23 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Washington
BLT: tokenizer-free LLM for efficiency and robustness
The Impact of Copyrighted Material on Large Language Models: A Norwegian Perspective
·1893 words·9 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 National Library of Norway
Researchers at the National Library of Norway show that training on copyrighted material improves Norwegian LLMs, but raises legal and ethical issues.
RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios
·3495 words·17 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 UC Santa Barbara
RuleArena, a new benchmark, rigorously evaluates large language models’ ability to apply complex, real-world rules across diverse scenarios, revealing significant shortcomings in current LLMs’ rule-guided reasoning.
Phi-4 Technical Report
·2630 words·13 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Microsoft Research
Phi-4: a 14B parameter LLM surpassing its teacher model (GPT-4) in STEM-focused QA through innovative synthetic data generation and post-training techniques.
JuStRank: Benchmarking LLM Judges for System Ranking
·13985 words·66 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 IBM Research
JuStRank, a benchmark for LLM judges as system rankers, reveals that judge qualities such as decisiveness and bias affect ranking accuracy, highlighting that strong instance-level performance doesn’t guarantee accurate system-level ranking.
SmolTulu: Higher Learning Rate to Batch Size Ratios Can Lead to Better Reasoning in SLMs
·2774 words·14 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Saudi Data & Artificial Intelligence Authority
Fine-tuning small language models? Raising the learning-rate-to-batch-size ratio can boost reasoning.
Granite Guardian
·4191 words·20 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 IBM Research
Granite Guardian: Open-source risk detection models for LLMs, surpassing existing models in accuracy and offering comprehensive coverage across multiple risk dimensions, promoting safer AI.
Contextualized Counterspeech: Strategies for Adaptation, Personalization, and Evaluation
·1928 words·10 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 University of Pisa
Contextualized AI counterspeech significantly outperforms generic methods by adapting to the moderation context and user, improving persuasiveness without sacrificing other qualities.
Training Large Language Models to Reason in a Continuous Latent Space
·2859 words·14 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Meta AI
LLMs are trained to reason using language, but COCONUT lets them reason directly in a continuous latent space, boosting performance on logical tasks requiring complex planning.
Fully Open Source Moxin-7B Technical Report
·334 words·2 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Northeastern University
Moxin-LLM: a fully open-source 7B-parameter LLM achieving superior zero-shot performance among comparable open models, promoting transparency and reproducibility in AI research.
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases
·5961 words·28 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 LG AI Research
LG AI Research unveils EXAONE 3.5, a series of instruction-tuned language models (2.4B, 7.8B, and 32B parameters) excelling in real-world tasks, long-context understanding, and general benchmarks.
Evaluating and Aligning CodeLLMs on Human Preference
·3535 words·17 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Alibaba Group
CodeArena, a novel benchmark, evaluates code LLMs on human preferences, revealing performance gaps between open-source and proprietary models and showing that a large-scale synthetic instruction corpus improves alignment.
Monet: Mixture of Monosemantic Experts for Transformers
·5131 words·25 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Korea University
Monet improves Transformer interpretability by scaling Mixture-of-Experts (MoE) to 262K monosemantic experts per layer, achieving parameter efficiency and enabling knowledge manipulation without performance degradation.
Marco-LLM: Bridging Languages via Massive Multilingual Training for Cross-Lingual Enhancement
·7239 words·34 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Alibaba International Digital Commerce
Marco-LLM: A groundbreaking multilingual LLM significantly enhances cross-lingual capabilities via massive multilingual training, bridging the performance gap between high- and low-resource languages.
Densing Law of LLMs
·1976 words·10 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Tsinghua University
LLMs’ training quality is improving exponentially: roughly every three months, models with half the parameters match state-of-the-art performance, reducing inference costs.
Weighted-Reward Preference Optimization for Implicit Model Fusion
·4595 words·22 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 School of Computer Science and Engineering, Sun Yat-Sen University
WRPO: Implicitly fuse LLMs, boosting performance without complex alignment or merging!
Robust Multi-bit Text Watermark with LLM-based Paraphrasers
·3046 words·15 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 ByteDance Research
Researchers developed a robust multi-bit text watermarking method using LLM-based paraphrasers, achieving over 99.99% detection accuracy while preserving semantic information and resisting common attacks.
Evaluating Language Models as Synthetic Data Generators
·4403 words·21 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Carnegie Mellon University
AGORABENCH: A new benchmark reveals surprising strengths & weaknesses of LMs as synthetic data generators, showing that problem-solving ability isn’t the sole indicator of data quality.
OCR Hinders RAG: Evaluating the Cascading Impact of OCR on Retrieval-Augmented Generation
·4800 words·23 mins
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Peking University
Imperfect OCR hinders Retrieval-Augmented Generation (RAG). OHRBench, a new benchmark, reveals this cascading impact, showing that current OCR solutions are insufficient for building high-quality RAG knowledge bases.