Large Language Models

Parameter-Efficient Fine-Tuning of Large Language Models for Unit Test Generation: An Empirical Study
·1998 words·10 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Norwegian University of Science and Technology
Boosting unit test generation efficiency, this study empirically evaluates various parameter-efficient fine-tuning methods on LLMs, demonstrating comparable performance to full fine-tuning at signific…
Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent
·1756 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tencent AI Lab
Tencent unveils Hunyuan-Large, a groundbreaking open-source MoE LLM boasting 389B parameters and 52B activated parameters, surpassing existing models in performance across various benchmarks.
DynaSaur: Large Language Agents Beyond Predefined Actions
·2738 words·13 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Maryland
DynaSaur: a novel LLM agent framework enabling dynamic action creation, surpassing prior methods with greater flexibility and top performance on the GAIA benchmark.
Sample-Efficient Alignment for LLMs
·2536 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Sea AI Lab
Sample-efficient LLM alignment achieved via a novel Thompson sampling algorithm (SEA), outperforming existing methods.
Swan and ArabicMTEB: Dialect-Aware, Arabic-Centric, Cross-Lingual, and Cross-Cultural Embedding Models and Benchmarks
·4411 words·21 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of British Columbia
Swan & ArabicMTEB: New dialect-aware Arabic embedding models and benchmark achieve state-of-the-art performance, addressing limitations of existing multilingual models.
LIBMoE: A Library for comprehensive benchmarking Mixture of Experts in Large Language Models
·2387 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 FPT Software AI Center
LibMoE: A new library streamlines MoE research by offering standardized training, evaluation, and a modular design, enabling efficient benchmarking of various MoE algorithms for LLMs.
Decoding Dark Matter: Specialized Sparse Autoencoders for Interpreting Rare Concepts in Foundation Models
·5414 words·26 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Carnegie Mellon University
Specialized Sparse Autoencoders (SSAEs) decode foundation models’ ‘dark matter’ features, efficiently extracting rare subdomain concepts for improved interpretability and safety.
LLaMo: Large Language Model-based Molecular Graph Assistant
·3401 words·16 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Korea University
LLaMo: a novel large molecular graph-language model seamlessly integrates molecular graph encoders and LLMs, achieving state-of-the-art performance in molecule description generation, property predict…
GlotCC: An Open Broad-Coverage CommonCrawl Corpus and Pipeline for Minority Languages
·1865 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 LMU Munich & Munich Center for Machine Learning
GlotCC: Open multilingual corpus & pipeline for minority languages, exceeding 1000 languages.
Constraint Back-translation Improves Complex Instruction Following of Large Language Models
·3717 words·18 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Tsinghua University
Constraint Back-translation enhances complex instruction following in LLMs by leveraging inherent constraints in existing datasets for efficient high-quality data creation.
BitStack: Fine-Grained Size Control for Compressed Large Language Models in Variable Memory Environments
·6027 words·29 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Fudan University
BitStack: Dynamic LLM sizing for variable memory!
Controlling Language and Diffusion Models by Transporting Activations
·11502 words·54 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Apple
Steering large language and diffusion models is made easy and efficient via Activation Transport (ACT)! This novel framework uses optimal transport theory to precisely control model activations, leadi…
AAAR-1.0: Assessing AI's Potential to Assist Research
·5113 words·25 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Pennsylvania State University
AAAR-1.0 benchmark rigorously evaluates LLMs’ ability to assist in four core research tasks, revealing both potential and limitations.
NeuZip: Memory-Efficient Training and Inference with Dynamic Compression of Neural Networks
·2943 words·14 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 University of Alberta
NeuZip dynamically compresses neural network weights, achieving memory-efficient training and inference without performance loss, significantly reducing the memory footprint of large language models.
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
·4787 words·23 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Alibaba Group
M2RC-EVAL: A new massively multilingual benchmark for repository-level code completion, featuring fine-grained annotations and a large instruction dataset, enabling better evaluation of code LLMs acro…