Skip to main content

3D Vision

Material Anything: Generating Materials for Any 3D Object via Diffusion
·4056 words·20 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Northwestern Polytechnical University
Material Anything: Generate realistic materials for ANY 3D object via diffusion!
Novel View Extrapolation with Video Diffusion Priors
·2381 words·12 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Nanyang Technological University
ViewExtrapolator leverages Stable Video Diffusion to realistically extrapolate novel views far beyond training data, dramatically improving the quality of 3D scene generation.
Generative World Explorer
·1739 words·9 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Johns Hopkins University
Generative World Explorer (Genex) enables agents to imaginatively explore environments, updating beliefs with generated observations for better decision-making.
Wavelet Latent Diffusion (Wala): Billion-Parameter 3D Generative Model with Compact Wavelet Encodings
·3736 words·18 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Autodesk
WaLa: a billion-parameter 3D generative model using wavelet encodings achieves state-of-the-art results, generating high-quality 3D shapes in seconds.
GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation
·2630 words·13 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
GaussianAnything: Interactive point cloud latent diffusion enables high-quality, editable 3D models from images or text, overcoming existing 3D generation limitations.
SAMPart3D: Segment Any Part in 3D Objects
·3136 words·15 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 University of Hong Kong
SAMPart3D: Zero-shot 3D part segmentation across granularities, scaling to large datasets & handling part ambiguity.
KMM: Key Frame Mask Mamba for Extended Motion Generation
·2527 words·12 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
KMM: Key Frame Mask Mamba generates extended, diverse human motion from text prompts by innovatively masking key frames in the Mamba architecture and using contrastive learning for improved text-motio…
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
·2263 words·11 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Tsinghua University
DimensionX generates photorealistic 3D and 4D scenes from a single image via controllable video diffusion, enabling precise manipulation of spatial structure and temporal dynamics.
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details
·5135 words·25 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 SSE, CUHKSZ, China
GarVerseLOD introduces a novel dataset and framework for high-fidelity 3D garment reconstruction from a single image, achieving unprecedented robustness via a hierarchical approach and leveraging a ma…
GenXD: Generating Any 3D and 4D Scenes
·2731 words·13 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 National University of Singapore
GenXD: A unified model generating high-quality 3D & 4D scenes from any number of images, advancing the field of dynamic scene generation.
DreamPolish: Domain Score Distillation With Progressive Geometry Generation
·2197 words·11 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
DreamPolish: A new text-to-3D model generates highly detailed 3D objects with polished surfaces and realistic textures using progressive geometry refinement and a novel domain score distillation tech…
DELTA: Dense Efficient Long-range 3D Tracking for any video
·3706 words·18 mins· loading · loading
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 UMass Amherst
DELTA: A new method efficiently tracks every pixel in 3D space from monocular videos, enabling accurate motion estimation across entire videos with state-of-the-art accuracy and over 8x speed improvem…