🏢 Zhejiang University
InfiGUIAgent: A Multimodal Generalist GUI Agent with Native Reasoning and Reflection
·2599 words·13 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Zhejiang University
InfiGUIAgent, a novel multimodal GUI agent, uses a two-stage training pipeline to achieve advanced reasoning and GUI interaction capabilities, outperforming existing models on benchmarks.
OneKE: A Dockerized Schema-Guided LLM Agent-based Knowledge Extraction System
·379 words·2 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Information Extraction
🏢 Zhejiang University
OneKE: a dockerized, schema-guided LLM agent system that efficiently extracts knowledge from diverse sources, offering adaptability and robust error handling.
Orient Anything: Learning Robust Object Orientation Estimation from Rendering 3D Models
·3014 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Zhejiang University
Orient Anything: Learning robust object orientation estimation directly from rendered 3D models, achieving state-of-the-art accuracy on real images.
Prompting Depth Anything for 4K Resolution Accurate Metric Depth Estimation
·4162 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Zhejiang University
Prompting unlocks 4K metric depth from low-cost LiDAR.
ZipAR: Accelerating Autoregressive Image Generation through Spatial Locality
·2050 words·10 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Zhejiang University
ZipAR accelerates autoregressive image generation through parallel decoding that exploits spatial locality in images, cutting the number of model forward passes by up to 91% and making high-resolution image generation significantly faster.
Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
·4118 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Zhejiang University
ScoreLiDAR: Distilling diffusion models for 5x faster, higher-quality 3D LiDAR scene completion!