
Natural Language Processing

Imitating Language via Scalable Inverse Reinforcement Learning
3278 words · 16 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Google DeepMind
This study presents a novel Inverse Reinforcement Learning (IRL) approach for fine-tuning large language models, offering improved performance and generation diversity compared to standard methods.
Image-aware Evaluation of Generated Medical Reports
1591 words · 8 mins
Natural Language Processing · Text Summarization · 🏢 Technion - Israel Institute of Technology
VLScore: a novel image-aware metric revolutionizes medical report evaluation by jointly assessing textual and visual similarities, significantly improving alignment with radiologist assessments.
IF-Font: Ideographic Description Sequence-Following Font Generation
3165 words · 15 mins
AI Generated · Natural Language Processing · Text Generation · 🏢 Fuzhou University
IF-Font: Revolutionary font generation using Ideographic Description Sequences (IDS) to surpass state-of-the-art methods in style transfer, especially for unique styles.
IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation
2083 words · 10 mins
Natural Language Processing · Large Language Models · 🏢 Tencent AI Lab
IDGen synthesizes LLM evaluation prompts using Item Discrimination theory, creating a more challenging and discriminative dataset than previous methods.
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
1541 words · 8 mins
Natural Language Processing · Vision-Language Models · 🏢 Xiamen University
I2EBench: a new benchmark for Instruction-based Image Editing (IIE) that evaluates models objectively across 16 dimensions aligned with human perception.
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
1828 words · 9 mins
Natural Language Processing · Large Language Models · 🏢 HPI / University of Potsdam
Boosting LLM accuracy, a new calibration method using a special [IDK] token explicitly models uncertainty, mitigating hallucinations and improving factual precision while maintaining knowledge retention.
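A minimal sketch of the idea (illustrative names and numbers, not the paper's code): extend the vocabulary with an [IDK] token and, whenever the model's prediction is wrong, shift part of the target probability mass onto it. The function `idk_calibration_loss` and the `shift` parameter are hypothetical.

```python
import torch
import torch.nn.functional as F

def idk_calibration_loss(logits, targets, idk_id, shift=0.5):
    """Illustrative loss: move `shift` of the target mass to [IDK]
    whenever the model's argmax prediction is wrong."""
    wrong = (logits.argmax(dim=-1) != targets).float()  # (batch,)
    idx = torch.arange(len(targets))
    soft = torch.zeros_like(logits)              # soft target distribution
    soft[idx, targets] = 1.0 - shift * wrong     # gold token keeps the rest
    soft[idx, idk_id] += shift * wrong           # uncertainty mass -> [IDK]
    return F.cross_entropy(logits, soft)         # CE with probability targets

# Toy usage: vocab of 5 ordinary tokens plus [IDK] at index 5.
logits = torch.randn(4, 6)
targets = torch.tensor([0, 2, 1, 4])
loss = idk_calibration_loss(logits, targets, idk_id=5)
```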
HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis
2870 words · 14 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 UC San Diego
HYSYNTH: A hybrid approach uses LLMs to create context-free surrogate models that guide efficient program synthesis, outperforming LLMs alone and existing synthesizers across multiple domains.
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
2980 words · 14 mins
Natural Language Processing · Large Language Models · 🏢 Georgia Institute of Technology
HYDRA, a novel model factorization framework, significantly improves black-box LLM personalization by capturing both user-specific behavior and shared knowledge, achieving a 9.01% average relative improvement.
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
2522 words · 12 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Carnegie Mellon University
Hydra: Bidirectional sequence modeling redefined with quasiseparable matrix mixers, outperforming existing models on various benchmarks!
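A toy sketch of the matrix-mixer framing the paper builds on (random stand-in matrices, not Hydra's structured parameterization): a sequence layer is viewed as Y = M·X, where a causal model has a lower-triangular mixer and a quasiseparable M adds the upper triangle, mixing both directions in one pass.

```python
import numpy as np

# The "matrix mixer" view: a sequence layer computes Y = M @ X for an
# L x L mixing matrix M. The random matrices below are stand-ins for
# structured semiseparable factors, purely to show the decomposition.
L, d = 6, 4
X = np.random.randn(L, d)                  # token representations
M_fwd = np.tril(np.random.rand(L, L))      # forward (causal) mixer stand-in
M_bwd = np.triu(np.random.rand(L, L), 1)   # backward mixer stand-in
Y = (M_fwd + M_bwd) @ X                    # bidirectional mixing in one matmul
```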
HuRef: HUman-REadable Fingerprint for Large Language Models
2598 words · 13 mins
Natural Language Processing · Large Language Models · 🏢 Shanghai Jiao Tong University
HuRef: Generate unique, human-readable fingerprints for LLMs to protect copyright without exposing model parameters or impeding training.
How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
3573 words · 17 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Apple
Transformers struggle with complex reasoning tasks. This paper introduces ‘globality degree’ to measure task difficulty and shows that high globality hinders efficient learning; an ‘inductive scratchpad’ can help break this globality barrier.
How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers
3480 words · 17 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology
Pre-trained language models’ base capabilities are significantly influenced by architecture, not just scale; a novel Combination Enhanced Architecture (CEA) improves performance by addressing the weaknesses of FFN-Wider Transformers.
How do Large Language Models Handle Multilingualism?
2895 words · 14 mins
Natural Language Processing · Large Language Models · 🏢 DAMO Academy, Alibaba Group, Singapore
LLMs surprisingly process multilingual queries via an English-centric intermediate stage before generating responses in the original language, a phenomenon explained by the proposed MWork framework.
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
4632 words · 22 mins
Natural Language Processing · Large Language Models · 🏢 KAIST
LLMs’ factual knowledge acquisition during pretraining is surprisingly non-linear: more data doesn’t guarantee better knowledge retention, and forgetting follows a power law.
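To see what a power-law forgetting curve implies, here is a synthetic toy (numbers invented, not the paper's measurements): the decay exponent falls out as a slope in log-log space.

```python
import numpy as np

# Synthetic example: retention r(t) = a * t^(-b) after the last exposure.
steps = np.array([1, 2, 4, 8, 16, 32, 64])  # training steps since exposure
retention = 0.9 * steps ** -0.35            # synthetic power-law decay
# log r = log a - b log t, so the exponent is the negated log-log slope.
slope, intercept = np.polyfit(np.log(steps), np.log(retention), 1)
print(f"estimated exponent b = {-slope:.2f} (true: 0.35)")
```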
HonestLLM: Toward an Honest and Helpful Large Language Model
3514 words · 17 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Peking University
HonestLLM boosts LLM honesty & helpfulness by 65.3% (Llama3-8b) and 124.7% (Mistral-7b) using training-free and fine-tuning methods, establishing principles and a new dataset (HONESET) for honesty evaluation.
HLM-Cite: Hybrid Language Model Workflow for Text-based Scientific Citation Prediction
2361 words · 12 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Tsinghua University
HLM-Cite: A hybrid language model workflow boosts scientific citation prediction accuracy by 17.6% and scales to 100K candidate papers, surpassing existing methods.
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
3025 words · 15 mins
Natural Language Processing · Large Language Models · 🏢 Ohio State University
HippoRAG, a neurobiologically inspired framework, dramatically improves LLM long-term memory and multi-hop question answering by synergistically orchestrating LLMs, knowledge graphs, and the Personalized PageRank algorithm.
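The retrieval core can be demoed in miniature. A hedged sketch with a toy graph and invented seed weights (not HippoRAG's actual pipeline): LLM-extracted query entities seed a Personalized PageRank over the knowledge graph, and node scores rank candidate passages.

```python
import networkx as nx

# Toy knowledge graph distilled from a passage corpus.
kg = nx.Graph()
kg.add_edges_from([
    ("Stanford", "Prof. Thomas"),
    ("Prof. Thomas", "Alzheimer's"),
    ("MIT", "robotics"),
])

# Query entities (as an LLM might extract them) act as personalization seeds.
seeds = {"Stanford": 0.5, "Alzheimer's": 0.5}
scores = nx.pagerank(kg, personalization=seeds)
for node, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {s:.3f}")  # high-scoring nodes point at passages to retrieve
```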
HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
2128 words · 10 mins
AI Generated · Natural Language Processing · Vision-Language Models · 🏢 AICV Lab, University of Arkansas
HENASY, a novel egocentric video-language model, uses a compositional approach to assemble scene entities for improved interpretability and performance.
HAWK: Learning to Understand Open-World Video Anomalies
3198 words · 16 mins
Natural Language Processing · Vision-Language Models · 🏢 Hong Kong University of Science and Technology
HAWK: a novel framework leveraging interactive VLMs and motion modality achieves state-of-the-art performance in open-world video anomaly understanding, generating descriptions and answering questions about anomalies.
GTBench: Uncovering the Strategic Reasoning Capabilities of LLMs via Game-Theoretic Evaluations
2898 words · 14 mins
Natural Language Processing · Large Language Models · 🏢 Drexel University
GTBench reveals LLMs’ strategic reasoning weaknesses via game-theoretic evaluations, showing strength in probabilistic scenarios but struggling with deterministic ones; code pretraining helps.