3D Vision

SAM-Guided Masked Token Prediction for 3D Scene Understanding

26 September 2024·1740 words·9 mins

AI Generated Computer Vision 3D Vision 🏢 Clemson University

This paper introduces SAM-guided masked token prediction, a novel framework for 3D scene understanding that leverages foundation models to significantly improve 3D object detection and semantic segmen…

SA3DIP: Segment Any 3D Instance with Potential 3D Priors

26 September 2024·1792 words·9 mins

3D Vision 🏢 Xidian University

SA3DIP boosts 3D instance segmentation accuracy by cleverly using 3D spatial and textural cues alongside 2D multi-view masks, overcoming limitations of previous methods.

RobIR: Robust Inverse Rendering for High-Illumination Scenes

26 September 2024·2339 words·11 mins

Computer Vision 3D Vision 🏢 Tencent AI Lab

RobIR: Robust inverse rendering in high-illumination scenes using ACES tone mapping and regularized visibility estimation for accurate BRDF reconstruction.

Rethinking 3D Convolution in $\ell_p$-norm Space

26 September 2024·1754 words·9 mins

3D Vision 🏢 University of Chinese Academy of Sciences

L1-norm based 3D convolution achieves competitive performance with lower energy consumption and latency compared to traditional methods, as proven through universal approximation theorem and experimen…

ReGS: Reference-based Controllable Scene Stylization with Gaussian Splatting

26 September 2024·1952 words·10 mins

Computer Vision 3D Vision 🏢 Johns Hopkins University

ReGS: Real-time reference-based 3D scene stylization using Gaussian Splatting for high-fidelity texture editing and free-view navigation.

Reconstruction of Manipulated Garment with Guided Deformation Prior

26 September 2024·2931 words·14 mins

Computer Vision 3D Vision 🏢 Computer Vision Lab, EPFL

Researchers developed a novel method for reconstructing the 3D shape of manipulated garments, achieving superior accuracy compared to existing techniques, particularly for complex, non-rigid deformati…

QUEEN: QUantized Efficient ENcoding for Streaming Free-viewpoint Videos

26 September 2024·3903 words·19 mins

AI Generated Computer Vision 3D Vision 🏢 University of Maryland

QUEEN: A novel framework for quantized and efficient streaming of free-viewpoint videos achieving high compression, quality, and speed.

ProvNeRF: Modeling per Point Provenance in NeRFs as a Stochastic Field

26 September 2024·1770 words·9 mins

Computer Vision 3D Vision 🏢 Stanford University

ProvNeRF enhances NeRF reconstruction by modeling per-point provenance as a stochastic field, improving novel view synthesis and uncertainty estimation, particularly in sparse, unconstrained view sett…

ProEdit: Simple Progression is All You Need for High-Quality 3D Scene Editing

26 September 2024·1818 words·9 mins

Computer Vision 3D Vision 🏢 University of Illinois Urbana-Champaign

ProEdit: High-quality 3D scene editing via progressive subtask decomposition.

PPLNs: Parametric Piecewise Linear Networks for Event-Based Temporal Modeling and Beyond

26 September 2024·1984 words·10 mins

Computer Vision 3D Vision 🏢 University of Texas at Austin

Parametric Piecewise Linear Networks (PPLNs) achieve state-of-the-art results in event-based and frame-based computer vision tasks by mimicking biological neural principles.

Polyhedral Complex Derivation from Piecewise Trilinear Networks

26 September 2024·2972 words·14 mins

Computer Vision 3D Vision 🏢 NAVER AI Lab

This paper presents a novel method for analytically extracting meshes from neural implicit surface networks using trilinear interpolation, offering theoretical insights and practical efficiency.

PointMamba: A Simple State Space Model for Point Cloud Analysis

26 September 2024·2563 words·13 mins

Computer Vision 3D Vision 🏢 Huazhong University of Science & Technology

PointMamba: A linear-complexity state space model achieving superior performance in point cloud analysis, reducing computational cost significantly.

PointAD: Comprehending 3D Anomalies from Points and Pixels for Zero-shot 3D Anomaly Detection

26 September 2024·5033 words·24 mins

AI Generated Computer Vision 3D Vision 🏢 Zhejiang University

PointAD: a novel zero-shot 3D anomaly detection method using CLIP’s strong generalization abilities to identify anomalies in unseen objects by transferring knowledge from both points and pixels.

Point-PRC: A Prompt Learning Based Regulation Framework for Generalizable Point Cloud Analysis

26 September 2024·2874 words·14 mins

Computer Vision 3D Vision 🏢 Department of Computer Science, Renmin University of China

Point-PRC improves generalizable 3D point cloud analysis by regulating prompt learning to harmonize task-specific and general knowledge within large 3D models.

Physically Compatible 3D Object Modeling from a Single Image

26 September 2024·1864 words·9 mins

3D Vision 🏢 Massachusetts Institute of Technology

Single image to physically compatible 3D objects: A new framework ensures 3D models maintain stability and mirror real-world equilibrium states, advancing realism in dynamic simulations and 3D printi…

PhyRecon: Physically Plausible Neural Scene Reconstruction

26 September 2024·2451 words·12 mins

Computer Vision 3D Vision 🏢 Tsinghua University

PhyRecon: A novel neural scene reconstruction method that uses differentiable rendering and physics simulation to produce physically plausible 3D models.

Pedestrian-Centric 3D Pre-collision Pose and Shape Estimation from Dashcam Perspective

26 September 2024·2531 words·12 mins

Computer Vision 3D Vision 🏢 University of Science and Technology Beijing

New Pedestrian-Vehicle Collision Pose dataset (PVCP) and Pose Estimation Network (PPSENet) improve pedestrian pre-collision pose estimation from dashcam video.

PCP-MAE: Learning to Predict Centers for Point Masked Autoencoders

26 September 2024·2194 words·11 mins

3D Vision 🏢 Shanghai Jiao Tong University

PCP-MAE enhances point cloud self-supervised learning by cleverly predicting masked patch centers, leading to superior 3D object classification and scene segmentation.

PCoTTA: Continual Test-Time Adaptation for Multi-Task Point Cloud Understanding

26 September 2024·2469 words·12 mins

AI Generated Computer Vision 3D Vision 🏢 Bournemouth University

PCoTTA: A novel framework enables multi-task point cloud models to seamlessly adapt to continuously changing target domains during testing, overcoming catastrophic forgetting and error accumulation.

OPUS: Occupancy Prediction Using a Sparse Set

26 September 2024·2458 words·12 mins

Computer Vision 3D Vision 🏢 Nankai University

OPUS: A novel, real-time occupancy prediction framework that uses a sparse set prediction paradigm and outperforms state-of-the-art methods on Occ3D-nuScenes.