
Natural Language Processing

Imitating Language via Scalable Inverse Reinforcement Learning
3278 words · 16 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Google DeepMind
This study presents a novel Inverse Reinforcement Learning (IRL) approach for fine-tuning large language models, offering improved performance and generation diversity compared to standard methods.
Image-aware Evaluation of Generated Medical Reports
1591 words · 8 mins
Natural Language Processing · Text Summarization · 🏢 Technion - Israel Institute of Technology
VLScore: a novel image-aware metric revolutionizes medical report evaluation by jointly assessing textual and visual similarities, significantly improving alignment with radiologist assessments.
IF-Font: Ideographic Description Sequence-Following Font Generation
3165 words · 15 mins
AI Generated · Natural Language Processing · Text Generation · 🏢 Fuzhou University
IF-Font: Revolutionary font generation using Ideographic Description Sequences (IDS) to surpass state-of-the-art methods in style transfer, especially for unique styles.
IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation
2083 words · 10 mins
Natural Language Processing · Large Language Models · 🏢 Tencent AI Lab
IDGen synthesizes LLM evaluation prompts using Item Discrimination theory, creating a more challenging and discriminative dataset than previous methods.
I2EBench: A Comprehensive Benchmark for Instruction-based Image Editing
1541 words · 8 mins
Natural Language Processing · Vision-Language Models · 🏢 Xiamen University
I2EBench: a new benchmark for Instruction-based Image Editing (IIE) that evaluates models objectively across 16 dimensions aligned with human perception.
I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token
1828 words · 9 mins
Natural Language Processing · Large Language Models · 🏢 HPI / University of Potsdam
Boosting LLM accuracy, a new calibration method using a special [IDK] token explicitly models uncertainty, mitigating hallucinations and improving factual precision while maintaining knowledge retention.
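A minimal sketch of the idea (illustrative names and numbers, not the paper's code): extend the vocabulary with an [IDK] token and, whenever the model's prediction is wrong, shift part of the target probability mass onto it. The function `idk_calibration_loss` and the `shift` parameter are hypothetical.

```python
import torch
import torch.nn.functional as F

def idk_calibration_loss(logits, targets, idk_id, shift=0.5):
    """Illustrative loss: move `shift` of the target mass to [IDK]
    whenever the model's argmax prediction is wrong."""
    wrong = (logits.argmax(dim=-1) != targets).float()  # (batch,)
    idx = torch.arange(len(targets))
    soft = torch.zeros_like(logits)              # soft target distribution
    soft[idx, targets] = 1.0 - shift * wrong     # gold token keeps the rest
    soft[idx, idk_id] += shift * wrong           # uncertainty mass -> [IDK]
    return F.cross_entropy(logits, soft)         # CE with probability targets

# Toy usage: vocab of 5 ordinary tokens plus [IDK] at index 5.
logits = torch.randn(4, 6)
targets = torch.tensor([0, 2, 1, 4])
loss = idk_calibration_loss(logits, targets, idk_id=5)
```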
HYSYNTH: Context-Free LLM Approximation for Guiding Program Synthesis
2870 words · 14 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 UC San Diego
HYSYNTH: A hybrid approach uses LLMs to create context-free surrogate models that guide efficient program synthesis, outperforming LLMs alone and existing synthesizers across multiple domains.
HYDRA: Model Factorization Framework for Black-Box LLM Personalization
2980 words · 14 mins
Natural Language Processing · Large Language Models · 🏢 Georgia Institute of Technology
HYDRA, a novel model factorization framework, significantly improves black-box LLM personalization by capturing both user-specific behavior and shared knowledge, achieving a 9.01% average relative improvement.
Hydra: Bidirectional State Space Models Through Generalized Matrix Mixers
2522 words · 12 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Carnegie Mellon University
Hydra: Bidirectional sequence modeling redefined with quasiseparable matrix mixers, outperforming existing models on various benchmarks!
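A toy sketch of the matrix-mixer framing the paper builds on (random stand-in matrices, not Hydra's structured parameterization): a sequence layer is viewed as Y = M·X, where a causal model has a lower-triangular mixer and a quasiseparable M adds the upper triangle, mixing both directions in one pass.

```python
import numpy as np

# The "matrix mixer" view: a sequence layer computes Y = M @ X for an
# L x L mixing matrix M. The random matrices below are stand-ins for
# structured semiseparable factors, purely to show the decomposition.
L, d = 6, 4
X = np.random.randn(L, d)                  # token representations
M_fwd = np.tril(np.random.rand(L, L))      # forward (causal) mixer stand-in
M_bwd = np.triu(np.random.rand(L, L), 1)   # backward mixer stand-in
Y = (M_fwd + M_bwd) @ X                    # bidirectional mixing in one matmul
```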
HuRef: HUman-REadable Fingerprint for Large Language Models
2598 words · 13 mins
Natural Language Processing · Large Language Models · 🏢 Shanghai Jiao Tong University
HuRef: Generate unique, human-readable fingerprints for LLMs to protect copyright without exposing model parameters or impeding training.
How Far Can Transformers Reason? The Globality Barrier and Inductive Scratchpad
3573 words · 17 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Apple
Transformers struggle with complex reasoning tasks. This paper introduces ‘globality degree’ to measure task difficulty and shows that high globality hinders efficient learning; an ‘inductive scratchpad’ can help break this globality barrier.
How does Architecture Influence the Base Capabilities of Pre-trained Language Models? A Case Study Based on FFN-Wider and MoE Transformers
3480 words · 17 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology
Pre-trained language models’ base capabilities are significantly influenced by architecture, not just scale; a novel Combination Enhanced Architecture (CEA) improves performance by addressing the weaknesses of FFN-Wider Transformers.
How do Large Language Models Handle Multilingualism?
2895 words · 14 mins
Natural Language Processing · Large Language Models · 🏢 DAMO Academy, Alibaba Group, Singapore
LLMs surprisingly process multilingual queries via an English-centric intermediate stage before generating responses in the original language, a phenomenon explained by the proposed MWork framework.
How Do Large Language Models Acquire Factual Knowledge During Pretraining?
4632 words · 22 mins
Natural Language Processing · Large Language Models · 🏢 KAIST
LLMs’ factual knowledge acquisition during pretraining is surprisingly non-linear: more data doesn’t guarantee better knowledge retention, and forgetting follows a power law.
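To see what a power-law forgetting curve implies, here is a synthetic toy (numbers invented, not the paper's measurements): the decay exponent falls out as a slope in log-log space.

```python
import numpy as np

# Synthetic example: retention r(t) = a * t^(-b) after the last exposure.
steps = np.array([1, 2, 4, 8, 16, 32, 64])  # training steps since exposure
retention = 0.9 * steps ** -0.35            # synthetic power-law decay
# log r = log a - b log t, so the exponent is the negated log-log slope.
slope, intercept = np.polyfit(np.log(steps), np.log(retention), 1)
print(f"estimated exponent b = {-slope:.2f} (true: 0.35)")
```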
HonestLLM: Toward an Honest and Helpful Large Language Model
3514 words · 17 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Peking University
HonestLLM boosts LLM honesty & helpfulness by 65.3% (Llama3-8b) and 124.7% (Mistral-7b) using training-free and fine-tuning methods, establishing principles and a new dataset (HONESET) for honesty evaluation.
HLM-Cite: Hybrid Language Model Workflow for Text-based Scientific Citation Prediction
2361 words · 12 mins
AI Generated · Natural Language Processing · Large Language Models · 🏢 Tsinghua University
HLM-Cite: A hybrid language model workflow boosts scientific citation prediction accuracy by 17.6% and scales to 100K candidate papers, surpassing existing methods.
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models
3025 words · 15 mins
Natural Language Processing · Large Language Models · 🏢 Ohio State University
HippoRAG, a neurobiologically inspired framework, dramatically improves LLM long-term memory and multi-hop question answering by synergistically orchestrating LLMs, knowledge graphs, and the Personalized PageRank algorithm.
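The retrieval core can be demoed in miniature. A hedged sketch with a toy graph and invented seed weights (not HippoRAG's actual pipeline): LLM-extracted query entities seed a Personalized PageRank over the knowledge graph, and node scores rank candidate passages.

```python
import networkx as nx

# Toy knowledge graph distilled from a passage corpus.
kg = nx.Graph()
kg.add_edges_from([
    ("Stanford", "Prof. Thomas"),
    ("Prof. Thomas", "Alzheimer's"),
    ("MIT", "robotics"),
])

# Query entities (as an LLM might extract them) act as personalization seeds.
seeds = {"Stanford": 0.5, "Alzheimer's": 0.5}
scores = nx.pagerank(kg, personalization=seeds)
for node, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {s:.3f}")  # high-scoring nodes point at passages to retrieve
```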
HENASY: Learning to Assemble Scene-Entities for Interpretable Egocentric Video-Language Model
2128 words · 10 mins
AI Generated · Natural Language Processing · Vision-Language Models · 🏢 AICV Lab, University of Arkansas
HENASY, a novel egocentric video-language model, uses a compositional approach to assemble scene entities for improved interpretability and performance.
HAWK: Learning to Understand Open-World Video Anomalies
3198 words · 16 mins
Natural Language Processing · Vision-Language Models · 🏢 Hong Kong University of Science and Technology
HAWK: a novel framework leveraging interactive VLMs and motion modality achieves state-of-the-art performance in open-world video anomaly understanding, generating descriptions and answering questions about anomalies.
GTBench: Uncovering the Strategic Reasoning Capabilities of LLMs via Game-Theoretic Evaluations
2898 words · 14 mins
Natural Language Processing · Large Language Models · 🏢 Drexel University
GTBench reveals LLMs’ strategic reasoning weaknesses via game-theoretic evaluations, showing strength in probabilistic scenarios but struggling with deterministic ones; code pretraining helps.