
3D Vision

KMM: Key Frame Mask Mamba for Extended Motion Generation
·2527 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
KMM: Key Frame Mask Mamba generates extended, diverse human motion from text prompts by masking key frames in the Mamba architecture and using contrastive learning for improved text-motio…
DimensionX: Create Any 3D and 4D Scenes from a Single Image with Controllable Video Diffusion
·2263 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Tsinghua University
DimensionX generates photorealistic 3D and 4D scenes from a single image via controllable video diffusion, enabling precise manipulation of spatial structure and temporal dynamics.
GarVerseLOD: High-Fidelity 3D Garment Reconstruction from a Single In-the-Wild Image using a Dataset with Levels of Details
·5135 words·25 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 SSE, CUHKSZ, China
GarVerseLOD introduces a novel dataset and framework for high-fidelity 3D garment reconstruction from a single image, achieving unprecedented robustness via a hierarchical approach and leveraging a ma…
GenXD: Generating Any 3D and 4D Scenes
·2731 words·13 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 National University of Singapore
GenXD: A unified model that generates high-quality 3D and 4D scenes from any number of input images.
DreamPolish: Domain Score Distillation With Progressive Geometry Generation
·2197 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 Peking University
DreamPolish: A new text-to-3D model generates highly detailed 3D objects with polished surfaces and realistic textures using progressive geometry refinement and a novel domain score distillation tech…
DELTA: Dense Efficient Long-range 3D Tracking for any video
·3706 words·18 mins
AI Generated 🤗 Daily Papers Computer Vision 3D Vision 🏢 UMass Amherst
DELTA: A new method efficiently tracks every pixel in 3D space from monocular videos, enabling accurate motion estimation across entire videos with state-of-the-art accuracy and over 8x speed improvem…