🏢 ByteDance Research

Multi-LLM Debate: Framework, Principals, and Interventions
·1604 words·8 mins
Natural Language Processing Large Language Models 🏢 ByteDance Research
Boosting LLM collaboration, this research introduces a novel theoretical framework for multi-LLM debate, revealing key principles like the effect of similar models and interventions to enhance accurac…
Mitigating Reward Overoptimization via Lightweight Uncertainty Estimation
·1697 words·8 mins
Natural Language Processing Large Language Models 🏢 ByteDance Research
ADVPO, a novel method, tackles reward overoptimization in RLHF via a lightweight uncertainty quantification approach, resulting in enhanced LLM performance and alignment with human values.
Learning the Optimal Policy for Balancing Short-Term and Long-Term Rewards
·1775 words·9 mins
Machine Learning Reinforcement Learning 🏢 ByteDance Research
A novel Decomposition-based Policy Learning (DPPL) method optimally balances short-term and long-term rewards, even with interrelated objectives, by transforming the problem into intuitive subproblems…
Classification Done Right for Vision-Language Pre-Training
·1685 words·8 mins
Multimodal Learning Vision-Language Models 🏢 ByteDance Research
SuperClass, a novel vision-language pre-training method, achieves superior performance on various downstream tasks by directly using tokenized raw text as supervised classification labels, eliminating…
Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization
·2901 words·14 mins
AI Generated AI Applications Healthcare 🏢 ByteDance Research
Revolutionizing antibody design, ABDPO uses direct energy-based preference optimization and a pre-trained diffusion model to generate high-quality antibodies with low energy and strong binding affinit…
AGILE: A Novel Reinforcement Learning Framework of LLM Agents
·5046 words·24 mins
AI Generated Natural Language Processing Question Answering 🏢 ByteDance Research
AGILE, a novel reinforcement learning framework, significantly enhances LLM agents’ performance on complex conversational tasks by integrating memory, tools, expert interactions, and reflection, outpe…
Achievable Fairness on Your Data With Utility Guarantees
·6805 words·32 mins
AI Generated AI Theory Fairness 🏢 ByteDance Research
This paper introduces a computationally efficient method to approximate the optimal accuracy-fairness trade-off curve for various datasets, providing rigorous statistical guarantees and quantifying un…