2025-03-07

Lost in Literalism: How Supervised Training Shapes Translationese in LLMs
·3432 words·17 mins
AI Generated 🤗 Daily Papers Natural Language Processing Machine Translation 🏢 Shanghai AI Laboratory
LLMs exhibit translationese due to biases in supervised training data; polishing references and filtering unnatural training instances can mitigate the issue.
IFIR: A Comprehensive Benchmark for Evaluating Instruction-Following in Expert-Domain Information Retrieval
·5266 words·25 mins
AI Generated 🤗 Daily Papers Natural Language Processing Information Extraction 🏢 School of Advanced Interdisciplinary Sciences, University of Chinese Academy of Sciences
IFIR: a new benchmark for instruction-following retrieval in expert domains, revealing current model limitations.
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion
·449 words·3 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 School of Computer Science and Engineering, Sun Yat-Sen University, China
FuseChat-3.0: Heterogeneous model fusion combined with preference optimization distills the strengths of multiple source LLMs into efficient, more capable target models.
EgoLife: Towards Egocentric Life Assistant
·3562 words·17 mins
AI Generated 🤗 Daily Papers Multimodal Learning Human-AI Interaction 🏢 NTU S-Lab
EgoLife: Ultra-long egocentric dataset & benchmark enabling AI assistants to understand and enhance daily life. Datasets and models released!
LINGOLY-TOO: Disentangling Memorisation from Reasoning with Linguistic Templatisation and Orthographic Obfuscation
·4618 words·22 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Oxford
LINGOLY-TOO: A new benchmark that disentangles memorisation from reasoning in LLMs via linguistic templatisation and orthographic obfuscation.
Identifying Sensitive Weights via Post-quantization Integral
·2603 words·13 mins
AI Generated 🤗 Daily Papers Machine Learning Deep Learning 🏢 Tsinghua University
PQI: A post-quantization integral metric that accurately identifies sensitive weights, improving LLM compression and performance!