🏢 ByteDance Research

Multi-LLM Debate: Framework, Principals, and Interventions

26 September 2024·1604 words·8 mins

Natural Language Processing Large Language Models 🏢 ByteDance Research

Boosting LLM collaboration, this research introduces a novel theoretical framework for multi-LLM debate, revealing key principles like the effect of similar models and interventions to enhance accurac…

Mitigating Reward Overoptimization via Lightweight Uncertainty Estimation

26 September 2024·1697 words·8 mins

Natural Language Processing Large Language Models 🏢 ByteDance Research

ADVPO tackles reward overoptimization in RLHF with lightweight uncertainty quantification, improving LLM performance and alignment with human values.

Learning the Optimal Policy for Balancing Short-Term and Long-Term Rewards

26 September 2024·1775 words·9 mins

Machine Learning Reinforcement Learning 🏢 ByteDance Research

A novel Decomposition-based Policy Learning (DPPL) method optimally balances short-term and long-term rewards, even with interrelated objectives, by transforming the problem into intuitive subproblems…

Classification Done Right for Vision-Language Pre-Training

26 September 2024·1685 words·8 mins

Multimodal Learning Vision-Language Models 🏢 ByteDance Research

SuperClass, a novel vision-language pre-training method, achieves superior performance on various downstream tasks by directly using tokenized raw text as supervised classification labels, eliminating…

Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization

26 September 2024·2901 words·14 mins

AI Generated AI Applications Healthcare 🏢 ByteDance Research

Revolutionizing antibody design, ABDPO uses direct energy-based preference optimization and a pre-trained diffusion model to generate high-quality antibodies with low energy and strong binding affinit…

AGILE: A Novel Reinforcement Learning Framework of LLM Agents

26 September 2024·5046 words·24 mins

AI Generated Natural Language Processing Question Answering 🏢 ByteDance Research

AGILE, a novel reinforcement learning framework, significantly enhances LLM agents’ performance on complex conversational tasks by integrating memory, tools, expert interactions, and reflection, outpe…

Achievable Fairness on Your Data With Utility Guarantees

26 September 2024·6805 words·32 mins

AI Generated AI Theory Fairness 🏢 ByteDance Research

This paper introduces a computationally efficient method to approximate the optimal accuracy-fairness trade-off curve for various datasets, providing rigorous statistical guarantees and quantifying un…