3D Vision

Optimal-state Dynamics Estimation for Physics-based Human Motion Capture from Videos

26 September 2024·2037 words·10 mins

Computer Vision 3D Vision 🏢 Department of Electrical Engineering, Linköping University

OSDCap: Online optimal-state dynamics estimation selectively incorporates physics models with kinematic observations to achieve highly accurate, physically-plausible human motion capture from videos.

OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding

26 September 2024·2396 words·12 mins

Computer Vision 3D Vision 🏢 Peking University

OpenGaussian achieves 3D point-level open vocabulary understanding using 3D Gaussian Splatting by training 3D instance features with high 3D consistency, employing a two-level codebook for feature dis…

OpenDlign: Open-World Point Cloud Understanding with Depth-Aligned Images

26 September 2024·2441 words·12 mins

Computer Vision 3D Vision 🏢 Imperial College London

OpenDlign uses novel depth-aligned images from a diffusion model to boost open-world 3D understanding, achieving significant performance gains on diverse benchmarks.

One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection

26 September 2024·2196 words·11 mins

Computer Vision 3D Vision 🏢 Tsinghua University

OneDet3D: A universal 3D object detector trained jointly on diverse indoor/outdoor datasets, achieving one-for-all performance across domains and categories.

ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splattings

26 September 2024·1959 words·10 mins

Computer Vision 3D Vision 🏢 Dept. of ECE & ASRI

ODGS: Lightning-fast 3D scene reconstruction from single omnidirectional images using 3D Gaussian splatting, achieving 100x speedup over NeRF-based methods.

OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries

26 September 2024·2593 words·13 mins

AI Generated Computer Vision 3D Vision 🏢 ShanghaiTech University

OctreeOcc uses octree queries for efficient and multi-granularity 3D occupancy prediction, surpassing state-of-the-art methods with reduced computational costs.

OccFusion: Rendering Occluded Humans with Generative Diffusion Priors

26 September 2024·2014 words·10 mins

Computer Vision 3D Vision 🏢 Stanford University

OccFusion: High-fidelity human rendering from videos, even with occlusions, using 3D Gaussian splatting and 2D diffusion priors.

Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering

26 September 2024·2264 words·11 mins

Computer Vision 3D Vision 🏢 Monash University

Normal-GS improves 3D Gaussian Splatting by integrating normal vectors into the rendering pipeline, achieving near state-of-the-art visual quality with accurate surface normals in real-time.

NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation

26 September 2024·2258 words·11 mins

Computer Vision 3D Vision 🏢 PhiGent Robotics

NeuroGauss4D-PCI masters complex point cloud interpolation using 4D neural fields and Gaussian deformation fields, achieving superior accuracy in dynamic scenes.

NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction

26 September 2024·2947 words·14 mins

Computer Vision 3D Vision 🏢 Shanghai Jiao Tong University

NeuRodin: A two-stage neural framework achieves high-fidelity 3D surface reconstruction from posed RGB images by innovatively addressing limitations in SDF-based methods, resulting in superior reconst…

Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set

26 September 2024·2791 words·14 mins

Computer Vision 3D Vision 🏢 Tsinghua University

Dynamically aligning 3D Gaussians to a neural SDF’s zero-level set improves neural SDF inference, enabling accurate, smooth 3D surface reconstruction.

Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses

26 September 2024·3744 words·18 mins

AI Generated Computer Vision 3D Vision 🏢 KAIST

Learn disentangled 3D object poses and transfer them between different object identities using a novel neural pose representation, boosting 3D shape generation!

Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation

26 September 2024·2789 words·14 mins

Computer Vision 3D Vision 🏢 University of Tübingen

Neural Localizer Fields (NLF) revolutionizes 3D human pose and shape estimation by learning a continuous field of point localizer functions, enabling flexible training on diverse data and on-the-fly p…

Neural Isometries: Taming Transformations for Equivariant ML

26 September 2024·2578 words·13 mins

Computer Vision 3D Vision 🏢 PlayStation

Neural Isometries learns a latent space where geometric relationships in the observation space are represented as isometries in the latent space, enabling efficient handling of complex symmetries and …

Neural Experts: Mixture of Experts for Implicit Neural Representations

26 September 2024·3441 words·17 mins

Computer Vision 3D Vision 🏢 Roblox

Boosting implicit neural representations, Neural Experts uses a Mixture of Experts architecture to achieve faster, more accurate, and memory-efficient signal reconstruction across various tasks.

NeuMA: Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics

26 September 2024·2379 words·12 mins

Computer Vision 3D Vision 🏢 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University

NeuMA: a novel neural material adaptor corrects existing physical models, accurately learning complex dynamics from visual observations.

MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views

26 September 2024·1997 words·10 mins

Computer Vision 3D Vision 🏢 Monash University

MVSplat360: Generating stunning 360° views from just a few images!

MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps

26 September 2024·2981 words·14 mins

AI Generated Computer Vision 3D Vision 🏢 National University of Singapore

MVSDet uses efficient plane sweeps for accurate indoor 3D object detection from multiple images, significantly outperforming previous NeRF-based methods.

MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing

26 September 2024·3629 words·18 mins

Computer Vision 3D Vision 🏢 Fudan University

MVInpainter: Pose-free multi-view consistent inpainting bridges 2D and 3D editing by simplifying 3D editing to a multi-view 2D inpainting task.

MVGamba: Unify 3D Content Generation as State Space Sequence Modeling

26 September 2024·2497 words·12 mins

Computer Vision 3D Vision 🏢 Nanyang Technological University

MVGamba: A unified, feed-forward 3D content generation model achieving state-of-the-art quality and speed using an RNN-like state space model for efficient multi-view Gaussian reconstruction.