🏢 Johns Hopkins University
DINeMo: Learning Neural Mesh Models with no 3D Annotations
·1595 words·8 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Johns Hopkins University
DINeMo learns neural mesh models without 3D annotations, using pseudo-correspondence from visual foundation models to improve pose estimation.
AgentRxiv: Towards Collaborative Autonomous Research
·1858 words·9 mins·
AI Generated
🤗 Daily Papers
AI Applications
Healthcare
🏢 Johns Hopkins University
AgentRxiv lets LLM agents share preprints with one another, enabling collaborative autonomous research and boosting performance and discovery.
R2-T2: Re-Routing in Test-Time for Multimodal Mixture-of-Experts
·3310 words·16 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Multimodal Reasoning
🏢 Johns Hopkins University
R2-T2 boosts multimodal MoE performance by re-routing experts at test time, with no retraining needed.
Is That Your Final Answer? Test-Time Scaling Improves Selective Question Answering
·2478 words·12 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Question Answering
🏢 Johns Hopkins University
Combining test-time scaling with confidence estimates improves selective question answering.
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More
·4088 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Classification
🏢 Johns Hopkins University
Smaller image patches improve vision transformer performance, defying conventional wisdom and revealing a new scaling law for enhanced visual understanding.
GenEx: Generating an Explorable World
·2719 words·13 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Embodied AI
🏢 Johns Hopkins University
GenEx generates explorable 3D worlds from a single image, enabling embodied AI agents to explore and learn.
Generative World Explorer
·1739 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Johns Hopkins University
Generative World Explorer (Genex) enables agents to imaginatively explore environments, updating beliefs with generated observations for better decision-making.