3D Vision
Optimal-state Dynamics Estimation for Physics-based Human Motion Capture from Videos
·2037 words·10 mins·
Computer Vision
3D Vision
🏢 Department of Electrical Engineering, Linköping University
OSDCap: Online optimal-state dynamics estimation selectively incorporates physics models with kinematic observations to achieve highly accurate, physically-plausible human motion capture from videos.
OpenGaussian: Towards Point-Level 3D Gaussian-based Open Vocabulary Understanding
·2396 words·12 mins·
Computer Vision
3D Vision
🏢 Peking University
OpenGaussian achieves 3D point-level open vocabulary understanding using 3D Gaussian Splatting by training 3D instance features with high 3D consistency, employing a two-level codebook for feature dis…
OpenDlign: Open-World Point Cloud Understanding with Depth-Aligned Images
·2441 words·12 mins·
Computer Vision
3D Vision
🏢 Imperial College London
OpenDlign uses novel depth-aligned images from a diffusion model to boost open-world 3D understanding, achieving significant performance gains on diverse benchmarks.
One for All: Multi-Domain Joint Training for Point Cloud Based 3D Object Detection
·2196 words·11 mins·
Computer Vision
3D Vision
🏢 Tsinghua University
OneDet3D: A universal 3D object detector trained jointly on diverse indoor/outdoor datasets, achieving one-for-all performance across domains and categories.
ODGS: 3D Scene Reconstruction from Omnidirectional Images with 3D Gaussian Splattings
·1959 words·10 mins·
Computer Vision
3D Vision
🏢 Dept. of ECE & ASRI
ODGS: Lightning-fast 3D scene reconstruction from single omnidirectional images using 3D Gaussian splatting, achieving 100x speedup over NeRF-based methods.
OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries
·2593 words·13 mins·
AI Generated
Computer Vision
3D Vision
🏢 ShanghaiTech University
OctreeOcc uses octree queries for efficient and multi-granularity 3D occupancy prediction, surpassing state-of-the-art methods with reduced computational costs.
OccFusion: Rendering Occluded Humans with Generative Diffusion Priors
·2014 words·10 mins·
Computer Vision
3D Vision
🏢 Stanford University
OccFusion: High-fidelity human rendering from videos, even with occlusions, using 3D Gaussian splatting and 2D diffusion priors.
Normal-GS: 3D Gaussian Splatting with Normal-Involved Rendering
·2264 words·11 mins·
Computer Vision
3D Vision
🏢 Monash University
Normal-GS improves 3D Gaussian Splatting by integrating normal vectors into the rendering pipeline, achieving near state-of-the-art visual quality with accurate surface normals in real time.
NeuroGauss4D-PCI: 4D Neural Fields and Gaussian Deformation Fields for Point Cloud Interpolation
·2258 words·11 mins·
Computer Vision
3D Vision
🏢 PhiGent Robotics
NeuroGauss4D-PCI masters complex point cloud interpolation using 4D neural fields and Gaussian deformation fields, achieving superior accuracy in dynamic scenes.
NeuRodin: A Two-stage Framework for High-Fidelity Neural Surface Reconstruction
·2947 words·14 mins·
Computer Vision
3D Vision
🏢 Shanghai Jiao Tong University
NeuRodin: A two-stage neural framework achieves high-fidelity 3D surface reconstruction from posed RGB images by innovatively addressing limitations in SDF-based methods, resulting in superior reconst…
Neural Signed Distance Function Inference through Splatting 3D Gaussians Pulled on Zero-Level Set
·2791 words·14 mins·
Computer Vision
3D Vision
🏢 Tsinghua University
Neural SDF inference is revolutionized by dynamically aligning 3D Gaussians to a neural SDF’s zero-level set, enabling accurate, smooth 3D surface reconstruction.
Neural Pose Representation Learning for Generating and Transferring Non-Rigid Object Poses
·3744 words·18 mins·
AI Generated
Computer Vision
3D Vision
🏢 KAIST
Learn disentangled 3D object poses and transfer them between different object identities using a novel neural pose representation, boosting 3D shape generation!
Neural Localizer Fields for Continuous 3D Human Pose and Shape Estimation
·2789 words·14 mins·
Computer Vision
3D Vision
🏢 University of Tübingen
Neural Localizer Fields (NLF) revolutionizes 3D human pose and shape estimation by learning a continuous field of point localizer functions, enabling flexible training on diverse data and on-the-fly p…
Neural Isometries: Taming Transformations for Equivariant ML
·2578 words·13 mins·
Computer Vision
3D Vision
🏢 PlayStation
Neural Isometries learns a latent space where geometric relationships in the observation space are represented as isometries in the latent space, enabling efficient handling of complex symmetries and …
Neural Experts: Mixture of Experts for Implicit Neural Representations
·3441 words·17 mins·
Computer Vision
3D Vision
🏢 Roblox
Boosting implicit neural representations, Neural Experts uses a Mixture of Experts architecture to achieve faster, more accurate, and memory-efficient signal reconstruction across various tasks.
NeuMA: Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics
·2379 words·12 mins·
Computer Vision
3D Vision
🏢 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University
NeuMA: a novel neural material adaptor corrects existing physical models, accurately learning complex dynamics from visual observations.
MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views
·1997 words·10 mins·
Computer Vision
3D Vision
🏢 Monash University
MVSplat360: Feed-forward synthesis of 360° scene views from just a few sparse input images.
MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps
·2981 words·14 mins·
AI Generated
Computer Vision
3D Vision
🏢 National University of Singapore
MVSDet uses efficient plane sweeps for accurate indoor 3D object detection from multiple images, significantly outperforming previous NeRF-based methods.
MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
·3629 words·18 mins·
Computer Vision
3D Vision
🏢 Fudan University
MVInpainter: Pose-free multi-view consistent inpainting bridges 2D and 3D editing by simplifying 3D editing to a multi-view 2D inpainting task.
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling
·2497 words·12 mins·
Computer Vision
3D Vision
🏢 Nanyang Technological University
MVGamba: A unified, feed-forward 3D content generation model achieving state-of-the-art quality and speed using an RNN-like state space model for efficient multi-view Gaussian reconstruction.