3D Vision

Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings
·3736 words·18 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Autodesk
WaLa: A billion-parameter 3D generative model that uses compact wavelet encodings to achieve state-of-the-art results, generating high-quality 3D shapes in seconds.
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation
·2630 words·13 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
GaussianAnything: Interactive point cloud latent diffusion enables high-quality, editable 3D models from images or text, overcoming existing 3D generation limitations.
SAMPart3D: Segment Any Part in 3D Objects
·3136 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 University of Hong Kong
SAMPart3D: Zero-shot 3D part segmentation across granularities that scales to large datasets and handles part ambiguity.
KMM: Key Frame Mask Mamba for Extended Motion Generation
·2527 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
KMM: Key Frame Mask Mamba generates extended, diverse human motion from text prompts by innovatively masking key frames in the Mamba architecture and using contrastive learning for improved text-motio…
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
·2263 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Tsinghua University
DimensionX generates photorealistic 3D and 4D scenes from a single image via controllable video diffusion, enabling precise manipulation of spatial structure and temporal dynamics.
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details
·5135 words·25 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 SSE, CUHKSZ, China
GarVerseLOD introduces a novel dataset and framework for high-fidelity 3D garment reconstruction from a single image, achieving unprecedented robustness via a hierarchical approach and leveraging a ma…
GenXD: Generating Any 3D and 4D Scenes
·2731 words·13 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 National University of Singapore
GenXD: A unified model that generates high-quality 3D and 4D scenes from any number of images, advancing dynamic scene generation.
DreamPolish: Domain Score Distillation With Progressive Geometry Generation
·2197 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
DreamPolish: A new text-to-3D model generates highly detailed 3D objects with polished surfaces and realistic textures using progressive geometry refinement and a novel domain score distillation tech…
DELTA: Dense Efficient Long-range 3D Tracking for any video
·3706 words·18 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 UMass Amherst
DELTA: A new method efficiently tracks every pixel in 3D space from monocular videos, enabling accurate motion estimation across entire videos with state-of-the-art accuracy and over 8x speed improvem…