
3D Vision

ZOPP: A Framework of Zero-shot Offboard Panoptic Perception for Autonomous Driving
·2348 words·12 mins
Computer Vision 3D Vision 🏢 Multimedia Laboratory, the Chinese University of Hong Kong
ZOPP: A groundbreaking framework for zero-shot offboard panoptic perception in autonomous driving, enabling high-quality 3D scene understanding without human labeling.
Zero-Shot Scene Reconstruction from Single Images with Deep Prior Assembly
·2753 words·13 mins
Computer Vision 3D Vision 🏢 Tsinghua University
Zero-shot 3D scene reconstruction from single images is achieved by assembling diverse deep priors from large models, eliminating the need for 3D/2D training data and achieving superior performance.
Zero-Shot Event-Intensity Asymmetric Stereo via Visual Prompting from Image Domain
·4096 words·20 mins
AI Generated Computer Vision 3D Vision 🏢 Peking University
Zero-shot Event-Intensity Asymmetric Stereo (ZEST) uses visual prompting and monocular cues to achieve robust 3D perception without event-specific training, outperforming existing methods.
X-Ray: A Sequential 3D Representation For Generation
·2206 words·11 mins
3D Vision 🏢 National University of Singapore
X-Ray: A novel 3D representation generating complete object surfaces from a single image!
WildGaussians: 3D Gaussian Splatting In the Wild
·2601 words·13 mins
AI Generated Computer Vision 3D Vision 🏢 ETH Zurich
WildGaussians enhances 3D Gaussian splatting for real-time rendering of photorealistic 3D scenes from in-the-wild images featuring occlusions and appearance changes.
Wild-GS: Real-Time Novel View Synthesis from Unconstrained Photo Collections
·1766 words·9 mins
Computer Vision 3D Vision 🏢 Johns Hopkins University
Wild-GS achieves real-time novel view synthesis from unconstrained photos by efficiently adapting 3D Gaussian Splatting, significantly improving speed and quality over existing methods.
VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization
·2230 words·11 mins
Computer Vision 3D Vision 🏢 State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), CASIA
VQ-Map leverages vector quantization to estimate bird’s-eye-view maps with unprecedented accuracy, setting new benchmarks.
Voxel Proposal Network via Multi-Frame Knowledge Distillation for Semantic Scene Completion
·2307 words·11 mins
AI Generated Computer Vision 3D Vision 🏢 Tianjin University
VPNet, a novel semantic scene completion network, uses multi-frame knowledge distillation and confident voxel proposals to improve accuracy and handle dynamic aspects of 3D scenes from point clouds.
Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection
·2040 words·10 mins
3D Vision 🏢 Hong Kong Polytechnic University
Voxel Mamba: a group-free 3D object detection method using state space models, achieving higher accuracy and efficiency by overcoming limitations of serialization-based Transformers.
Vision Foundation Model Enables Generalizable Object Pose Estimation
·3435 words·17 mins
AI Generated Computer Vision 3D Vision 🏢 Chinese University of Hong Kong
VFM-6D: a novel framework achieving generalizable object pose estimation for unseen categories by leveraging vision-language models.
VCR-GauS: View Consistent Depth-Normal Regularizer for Gaussian Surface Reconstruction
·2586 words·13 mins
Computer Vision 3D Vision 🏢 National University of Singapore
VCR-GauS: Novel view-consistent depth-normal regularizer for superior, real-time 3D surface reconstruction using Gaussian splatting.
Variational Multi-scale Representation for Estimating Uncertainty in 3D Gaussian Splatting
·2343 words·11 mins
AI Generated Computer Vision 3D Vision 🏢 Hong Kong Baptist University
New uncertainty estimation method for 3D Gaussian Splatting improves scene reconstruction quality by leveraging variational multi-scale representation and efficiently removing noisy data.
UV-free Texture Generation with Denoising and Geodesic Heat Diffusion
·2448 words·12 mins
Computer Vision 3D Vision 🏢 Imperial College London
UV3-TeD generates high-quality 3D textures directly on object surfaces using a novel diffusion probabilistic model, eliminating UV-mapping limitations.
Unlearnable 3D Point Clouds: Class-wise Transformation Is All You Need
·3088 words·15 mins
Computer Vision 3D Vision 🏢 Huazhong University of Science and Technology
New unlearnable framework secures 3D point cloud data by using class-wise transformations, enabling authorized training while preventing unauthorized access.
UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections
·3538 words·17 mins
AI Generated Computer Vision 3D Vision 🏢 ETH Zurich
UniSDF: Unifying neural representations reconstructs complex scenes with reflections, achieving state-of-the-art performance by blending camera and reflected view radiance fields.
Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
·2641 words·13 mins
AI Generated Computer Vision 3D Vision 🏢 Tsinghua University
Unique3D: Single image to high-fidelity 3D mesh in 30 seconds!
Unified Domain Generalization and Adaptation for Multi-View 3D Object Detection
·3024 words·15 mins
AI Generated Computer Vision 3D Vision 🏢 Korea University
Unified Domain Generalization and Adaptation (UDGA) tackles 3D object detection's domain adaptation challenges by leveraging multi-view overlap and label-efficient learning, achieving state-of-the-art performance.
UniDSeg: Unified Cross-Domain 3D Semantic Segmentation via Visual Foundation Models Prior
·3219 words·16 mins
AI Generated Computer Vision 3D Vision 🏢 Xiamen University
UniDSeg uses Visual Foundation Models to create a unified framework for adaptable and generalizable cross-domain 3D semantic segmentation, achieving state-of-the-art results.
Training an Open-Vocabulary Monocular 3D Detection Model without 3D Data
·3285 words·16 mins
AI Generated Computer Vision 3D Vision 🏢 Tsinghua University
Train open-vocabulary 3D object detectors using only RGB images and large language models, achieving state-of-the-art performance without expensive LiDAR data.
Towards Learning Group-Equivariant Features for Domain Adaptive 3D Detection
·1931 words·10 mins
Computer Vision 3D Vision 🏢 University of Oxford
GroupEXP-DA boosts domain adaptive 3D object detection by using a grouping-exploration strategy to reduce bias in pseudo-label collection and account for multiple factors affecting object perception.