Computer Vision

Distribution Guidance Network for Weakly Supervised Point Cloud Semantic Segmentation
·2253 words·11 mins
Computer Vision 3D Vision 🏢 Peking University
DGNet enhances weakly supervised point cloud segmentation by aligning feature embeddings to a mixture of von Mises-Fisher distributions, achieving state-of-the-art performance.
DistillNeRF: Perceiving 3D Scenes from Single-Glance Images by Distilling Neural Fields and Foundation Model Features
·2827 words·14 mins
AI Generated Computer Vision 3D Vision 🏢 NVIDIA Research
DistillNeRF: a self-supervised learning framework enabling accurate 3D scene reconstruction from sparse, single-frame images by cleverly distilling features from offline NeRFs and 2D foundation models…
Disentangled Style Domain for Implicit $z$-Watermark Towards Copyright Protection
·1999 words·10 mins
Computer Vision Image Generation 🏢 Fudan University
This paper introduces a novel implicit Zero-Watermarking scheme using disentangled style domains to detect unauthorized dataset usage in text-to-image models, offering robust copyright protection via …
DisC-GS: Discontinuity-aware Gaussian Splatting
·2095 words·10 mins
Computer Vision 3D Vision 🏢 Lancaster University
DisC-GS enhances Gaussian Splatting for real-time novel view synthesis by accurately rendering image discontinuities and boundaries, improving visual quality.
Director3D: Real-world Camera Trajectory and 3D Scene Generation from Text
·2781 words·14 mins
AI Generated Computer Vision 3D Vision 🏢 Shanghai Artificial Intelligence Laboratory
Director3D generates realistic 3D scenes and camera trajectories from text descriptions using a three-stage pipeline: Cinematographer, Decorator, and Detailer.
Direct3D: Scalable Image-to-3D Generation via 3D Latent Diffusion Transformer
·2139 words·11 mins
Computer Vision 3D Vision 🏢 University of Oxford
Direct3D is a scalable, native 3D diffusion model for image-to-3D generation that achieves state-of-the-art quality.
Direct Unlearning Optimization for Robust and Safe Text-to-Image Models
·4016 words·19 mins
AI Generated Computer Vision Image Generation 🏢 NAVER AI Lab
Direct Unlearning Optimization (DUO) robustly removes unsafe content from text-to-image models by using paired image data and output-preserving regularization, effectively defending against adversaria…
Direct Consistency Optimization for Robust Customization of Text-to-Image Diffusion models
·3011 words·15 mins
Computer Vision Image Generation 🏢 KAIST
Boosting personalized image generation! Direct Consistency Optimization (DCO) fine-tunes text-to-image models, ensuring subject consistency and prompt fidelity, even when merging separately customized…
DiPEx: Dispersing Prompt Expansion for Class-Agnostic Object Detection
·2876 words·14 mins
AI Generated Computer Vision Object Detection 🏢 University of Queensland
DiPEx: a novel self-supervised prompt expansion method dramatically boosts class-agnostic object detection by progressively learning non-overlapping hyperspherical prompts, surpassing existing methods…
DiP-GO: A Diffusion Pruner via Few-step Gradient Optimization
·2302 words·11 mins
Computer Vision Image Generation 🏢 Advanced Micro Devices, Inc.
DiP-GO: A novel pruning method accelerates diffusion models via few-step gradient optimization, achieving a 4.4x speedup on Stable Diffusion 1.5 without accuracy loss.
DINTR: Tracking via Diffusion-based Interpolation
·2223 words·11 mins
Computer Vision Object Detection 🏢 University of Arkansas
DINTR is a diffusion-based object tracker that uses efficient interpolation to outperform existing methods across diverse benchmarks.
DiMSUM: Diffusion Mamba - A Scalable and Unified Spatial-Frequency Method for Image Generation
·3655 words·18 mins
Computer Vision Image Generation 🏢 VinAI Research
DiMSUM: A novel diffusion model boosts image generation by unifying spatial and frequency information, achieving superior results and faster training.
DiffusionFake: Enhancing Generalization in Deepfake Detection via Guided Stable Diffusion
·2705 words·13 mins
AI Generated Computer Vision Face Recognition 🏢 Tencent AI Lab
DiffusionFake enhances deepfake detection by cleverly reversing the image generation process, enabling detectors to learn more robust features and significantly improve cross-domain generalization.
DiffusionBlend: Learning 3D Image Prior through Position-aware Diffusion Score Blending for 3D Computed Tomography Reconstruction
·2570 words·13 mins
Computer Vision 3D Vision 🏢 University of Michigan
DiffusionBlend++ learns a 3D image prior via position-aware diffusion score blending, achieving state-of-the-art 3D CT reconstruction with superior efficiency.
Diffusion4D: Fast Spatial-temporal Consistent 4D generation via Video Diffusion Models
·1559 words·8 mins
Computer Vision Image Generation 🏢 University of Toronto
Diffusion4D: Fast, consistent 4D content generation via a novel 4D-aware video diffusion model, surpassing existing methods in efficiency and 4D geometry consistency.
Diffusion-based Layer-wise Semantic Reconstruction for Unsupervised Out-of-Distribution Detection
·2879 words·14 mins
Computer Vision Out-of-Distribution Detection 🏢 Xidian University
Unsupervised OOD detection gets a boost with a diffusion-based approach that leverages multi-layer semantic feature reconstruction for improved accuracy and speed.
DiffuLT: Diffusion for Long-tail Recognition Without External Knowledge
·2601 words·13 mins
Computer Vision Image Classification 🏢 National Key Laboratory for Novel Software Technology, Nanjing University
DiffuLT uses a novel diffusion model to generate balanced training data from imbalanced datasets, achieving state-of-the-art results in long-tailed image recognition without external knowledge.
DiffuBox: Refining 3D Object Detection with Point Diffusion
·3129 words·15 mins
Computer Vision 3D Vision 🏢 Cornell University
DiffuBox is a diffusion-based approach that refines 3D object detection bounding boxes using the surrounding LiDAR point clouds, significantly improving accuracy across various domains.
DiffCut: Catalyzing Zero-Shot Semantic Segmentation with Diffusion Features and Recursive Normalized Cut
·3253 words·16 mins
AI Generated Computer Vision Image Segmentation 🏢 Thales
DiffCut, a novel unsupervised zero-shot semantic segmentation method, leverages diffusion UNet features and recursive normalized cuts to achieve state-of-the-art performance.
DI-MaskDINO: A Joint Object Detection and Instance Segmentation Model
·1985 words·10 mins
Computer Vision Object Detection 🏢 Tsinghua University
DI-MaskDINO significantly boosts object detection and instance segmentation accuracy by addressing the performance imbalance between the two tasks with a De-Imbalance module and Balance-Aware Tokens Optimization.