3D Vision
Neural LightRig: Unlocking Accurate Object Normal and Material Estimation with Multi-Light Diffusion
·3868 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Chinese University of Hong Kong
Neural LightRig uses multi-light diffusion to accurately estimate object normals and materials from a single image, outperforming existing methods.
FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction
·4390 words·21 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tencent AI Lab
FreeSplatter: a novel feed-forward framework that reconstructs high-quality 3D scenes from uncalibrated sparse-view images and estimates camera poses in seconds.
MIDI: Multi-Instance Diffusion for Single Image to 3D Scene Generation
·2260 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tsinghua University
MIDI: a novel multi-instance diffusion model generates compositional 3D scenes from single images by simultaneously creating multiple 3D instances with accurate spatial relationships and high generali…
Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
·4118 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Zhejiang University
ScoreLiDAR: Distilling diffusion models for 5x faster, higher-quality 3D LiDAR scene completion!
2DGS-Room: Seed-Guided 2D Gaussian Splatting with Geometric Constrains for High-Fidelity Indoor Scene Reconstruction
·2645 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tsinghua University
2DGS-Room: Seed-guided 2D Gaussian splatting with geometric constraints achieves state-of-the-art high-fidelity indoor scene reconstruction.
Structured 3D Latents for Scalable and Versatile 3D Generation
·4249 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tsinghua University
Unified 3D latent representation (SLAT) enables versatile high-quality 3D asset generation, significantly outperforming existing methods.
One Shot, One Talk: Whole-body Talking Avatar from a Single Image
·2297 words·11 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 University of Science and Technology of China
From a single image to a realistic, animatable talking avatar! A novel pipeline uses diffusion models and a hybrid 3DGS-mesh representation, achieving seamless generalization and precise control.
AlphaTablets: A Generic Plane Representation for 3D Planar Reconstruction from Monocular Videos
·2678 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tsinghua University
AlphaTablets: A novel 3D plane representation enabling accurate, consistent, and flexible 3D planar reconstruction from monocular videos, achieving state-of-the-art results.
Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters
·4458 words·21 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Tencent PCG
Make-It-Animatable: Instantly create animation-ready 3D characters, regardless of pose or shape, using a novel data-driven framework.
CAT4D: Create Anything in 4D with Multi-View Video Diffusion Models
·3896 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Google DeepMind
CAT4D: Create realistic 4D scenes from single-view videos using a novel multi-view video diffusion model.
MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation
·4827 words·23 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 DFKI
MARVEL-40M+ & MARVEL-FX3D: 40M+ high-quality 3D annotations & a fast two-stage text-to-3D pipeline enabling high-fidelity 3D model generation within 15 seconds.
SplatFlow: Multi-View Rectified Flow Model for 3D Gaussian Splatting Synthesis
·3638 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Twelve Labs
SplatFlow: A novel multi-view rectified flow model enabling direct 3D Gaussian splatting generation & training-free editing for diverse 3D tasks.
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
·2778 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Nanyang Technological University
SAR3D: Blazing-fast autoregressive 3D object generation and understanding using a multi-scale VQVAE, achieving sub-second generation and detailed multimodal comprehension.
Learning 3D Representations from Procedural 3D Programs
·4094 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 University of Virginia
Self-supervised learning of 3D representations from procedurally generated synthetic shapes achieves comparable performance to models trained on real-world datasets, highlighting the potential of synt…
Material Anything: Generating Materials for Any 3D Object via Diffusion
·4056 words·20 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Northwestern Polytechnical University
Material Anything: Generate realistic materials for ANY 3D object via diffusion!
Novel View Extrapolation with Video Diffusion Priors
·2381 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Nanyang Technological University
ViewExtrapolator leverages Stable Video Diffusion to realistically extrapolate novel views far beyond training data, dramatically improving the quality of 3D scene generation.
Generative World Explorer
·1739 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Johns Hopkins University
Generative World Explorer (Genex) enables agents to imaginatively explore environments, updating beliefs with generated observations for better decision-making.
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings
·3736 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Autodesk
WaLa: a billion-parameter 3D generative model using wavelet encodings achieves state-of-the-art results, generating high-quality 3D shapes in seconds.
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation
·2630 words·13 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Peking University
GaussianAnything: Interactive point cloud latent diffusion enables high-quality, editable 3D models from images or text, overcoming existing 3D generation limitations.
SAMPart3D: Segment Any Part in 3D Objects
·3136 words·15 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 University of Hong Kong
SAMPart3D: Zero-shot 3D part segmentation across granularities, scaling to large datasets & handling part ambiguity.