🏢 ByteDance Research
Multi-LLM Debate: Framework, Principals, and Interventions
1604 words · 8 mins
Natural Language Processing
Large Language Models
🏢 ByteDance Research
Boosting LLM collaboration, this research introduces a novel theoretical framework for multi-LLM debate, revealing key principles like the effect of similar models and interventions to enhance accurac…
Mitigating Reward Overoptimization via Lightweight Uncertainty Estimation
1697 words · 8 mins
Natural Language Processing
Large Language Models
🏢 ByteDance Research
ADVPO, a novel method, tackles reward overoptimization in RLHF via a lightweight uncertainty quantification approach, resulting in enhanced LLM performance and alignment with human values.
Learning the Optimal Policy for Balancing Short-Term and Long-Term Rewards
1775 words · 9 mins
Machine Learning
Reinforcement Learning
🏢 ByteDance Research
A novel Decomposition-based Policy Learning (DPPL) method optimally balances short-term and long-term rewards, even with interrelated objectives, by transforming the problem into intuitive subproblems…
Classification Done Right for Vision-Language Pre-Training
1685 words · 8 mins
Multimodal Learning
Vision-Language Models
🏢 ByteDance Research
SuperClass, a novel vision-language pre-training method, achieves superior performance on various downstream tasks by directly using tokenized raw text as supervised classification labels, eliminating…
Antigen-Specific Antibody Design via Direct Energy-based Preference Optimization
2901 words · 14 mins
AI Generated
AI Applications
Healthcare
🏢 ByteDance Research
Revolutionizing antibody design, ABDPO uses direct energy-based preference optimization and a pre-trained diffusion model to generate high-quality antibodies with low energy and strong binding affinit…
AGILE: A Novel Reinforcement Learning Framework of LLM Agents
5046 words · 24 mins
AI Generated
Natural Language Processing
Question Answering
🏢 ByteDance Research
AGILE, a novel reinforcement learning framework, significantly enhances LLM agents’ performance on complex conversational tasks by integrating memory, tools, expert interactions, and reflection, outpe…
Achievable Fairness on Your Data With Utility Guarantees
6805 words · 32 mins
AI Generated
AI Theory
Fairness
🏢 ByteDance Research
This paper introduces a computationally efficient method to approximate the optimal accuracy-fairness trade-off curve for various datasets, providing rigorous statistical guarantees and quantifying un…