🏢 National Key Laboratory of Human-Machine Hybrid Augmented Intelligence

Referencing Where to Focus: Improving Visual Grounding with Referential Query
·2958 words·14 mins
Multimodal Learning Vision-Language Models 🏢 National Key Laboratory of Human-Machine Hybrid Augmented Intelligence
RefFormer improves visual grounding accuracy by adapting its queries with multi-level image features, which guides the decoder toward the target object.
Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
·2428 words·12 mins
Machine Learning Reinforcement Learning 🏢 National Key Laboratory of Human-Machine Hybrid Augmented Intelligence
Generative world models improve multi-agent decision-making by simulating trial-and-error learning, which raises answer accuracy and explainability.