3D Vision

SAM-Guided Masked Token Prediction for 3D Scene Understanding
·1740 words·9 mins
AI Generated Computer Vision 3D Vision 🏢 Clemson University
This paper introduces SAM-guided masked token prediction, a novel framework for 3D scene understanding that leverages foundation models to significantly improve 3D object detection and semantic segmen…
SA3DIP: Segment Any 3D Instance with Potential 3D Priors
·1792 words·9 mins
3D Vision 🏢 Xidian University
SA3DIP boosts 3D instance segmentation accuracy by cleverly using 3D spatial and textural cues alongside 2D multi-view masks, overcoming limitations of previous methods.
RobIR: Robust Inverse Rendering for High-Illumination Scenes
·2339 words·11 mins
Computer Vision 3D Vision 🏢 Tencent AI Lab
RobIR: Robust inverse rendering in high-illumination scenes using ACES tone mapping and regularized visibility estimation for accurate BRDF reconstruction.
Rethinking 3D Convolution in $\ell_p$-norm Space
·1754 words·9 mins
3D Vision 🏢 University of Chinese Academy of Sciences
L1-norm-based 3D convolution achieves competitive performance with lower energy consumption and latency than traditional methods, as proven through a universal approximation theorem and experimen…
ReGS: Reference-based Controllable Scene Stylization with Gaussian Splatting
·1952 words·10 mins
Computer Vision 3D Vision 🏢 Johns Hopkins University
ReGS: Real-time reference-based 3D scene stylization using Gaussian Splatting for high-fidelity texture editing and free-view navigation.
Reconstruction of Manipulated Garment with Guided Deformation Prior
·2931 words·14 mins
Computer Vision 3D Vision 🏢 Computer Vision Lab, EPFL
Researchers developed a novel method for reconstructing the 3D shape of manipulated garments, achieving superior accuracy compared to existing techniques, particularly for complex, non-rigid deformati…
QUEEN: QUantized Efficient ENcoding for Streaming Free-viewpoint Videos
·3903 words·19 mins
AI Generated Computer Vision 3D Vision 🏢 University of Maryland
QUEEN: A novel framework for quantized and efficient streaming of free-viewpoint videos achieving high compression, quality, and speed.
ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Field
·1770 words·9 mins
Computer Vision 3D Vision 🏢 Stanford University
ProvNeRF enhances NeRF reconstruction by modeling per-point provenance as a stochastic field, improving novel view synthesis and uncertainty estimation, particularly in sparse, unconstrained view sett…
ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing
·1818 words·9 mins
Computer Vision 3D Vision 🏢 University of Illinois Urbana-Champaign
ProEdit: High-quality 3D scene editing via progressive subtask decomposition.
PPLNs: Parametric Piecewise Linear Networks for Event-Based Temporal Modeling and Beyond
·1984 words·10 mins
Computer Vision 3D Vision 🏢 University of Texas at Austin
Parametric Piecewise Linear Networks (PPLNs) achieve state-of-the-art results in event-based and frame-based computer vision tasks by mimicking biological neural principles.
Polyhedral Complex Derivation from Piecewise Trilinear Networks
·2972 words·14 mins
Computer Vision 3D Vision 🏢 NAVER AI Lab
This paper presents a novel method for analytically extracting meshes from neural implicit surface networks using trilinear interpolation, offering theoretical insights and practical efficiency.
PointMamba: A Simple State Space Model for Point Cloud Analysis
·2563 words·13 mins
Computer Vision 3D Vision 🏢 Huazhong University of Science & Technology
PointMamba: A linear-complexity state space model achieving superior performance in point cloud analysis, reducing computational cost significantly.
PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection
·5033 words·24 mins
AI Generated Computer Vision 3D Vision 🏢 Zhejiang University
PointAD: a novel zero-shot 3D anomaly detection method using CLIP’s strong generalization abilities to identify anomalies in unseen objects by transferring knowledge from both points and pixels.
Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis
·2874 words·14 mins
Computer Vision 3D Vision 🏢 Department of Computer Science, Renmin University of China
Point-PRC improves generalizable 3D point cloud analysis by regulating prompt learning to harmonize task-specific and general knowledge within large 3D models.
Physically Compatible 3D Object Modeling from a Single Image
·1864 words·9 mins
3D Vision 🏢 Massachusetts Institute of Technology
Single image to physically compatible 3D objects: A new framework ensures 3D models maintain stability and mirror real-world equilibrium states, advancing realism in dynamic simulations and 3D printi…
PhyRecon: Physically Plausible Neural Scene Reconstruction
·2451 words·12 mins
Computer Vision 3D Vision 🏢 Tsinghua University
PhyRecon: A novel neural scene reconstruction method that uses differentiable rendering and physics simulation to produce physically plausible 3D models.
Pedestrian-Centric 3D Pre-collision Pose and Shape Estimation from Dashcam Perspective
·2531 words·12 mins
Computer Vision 3D Vision 🏢 University of Science and Technology Beijing
New Pedestrian-Vehicle Collision Pose dataset (PVCP) and Pose Estimation Network (PPSENet) improve pedestrian pre-collision pose estimation from dashcam video.
PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders
·2194 words·11 mins
3D Vision 🏢 Shanghai Jiao Tong University
PCP-MAE enhances point cloud self-supervised learning by cleverly predicting masked patch centers, leading to superior 3D object classification and scene segmentation.
PCoTTA: Continual Test-Time Adaptation for Multi-Task Point Cloud Understanding
·2469 words·12 mins
AI Generated Computer Vision 3D Vision 🏢 Bournemouth University
PCoTTA: A novel framework enables multi-task point cloud models to seamlessly adapt to continuously changing target domains during testing, overcoming catastrophic forgetting and error accumulation.
OPUS: Occupancy Prediction Using a Sparse Set
·2458 words·12 mins
Computer Vision 3D Vision 🏢 Nankai University
OPUS: a novel, real-time occupancy prediction framework using a sparse set prediction paradigm, outperforms state-of-the-art methods on Occ3D-nuScenes.