Computer Vision

Taming Generative Diffusion Prior for Universal Blind Image Restoration

26 September 2024·4450 words·21 mins

AI Generated Computer Vision Image Generation 🏢 Fudan University

BIR-D tames generative diffusion models for universal blind image restoration, dynamically updating parameters to handle various complex degradations without assuming degradation model types.

Taming Diffusion Prior for Image Super-Resolution with Domain Shift SDEs

26 September 2024·2142 words·11 mins

Computer Vision Image Generation 🏢 Advanced Micro Devices Inc.

DoSSR: A novel super-resolution (SR) model that boosts efficiency by 5-7x, achieving state-of-the-art performance with only 5 sampling steps by integrating a domain shift equation into pretrained diffusion models.

SyncVIS: Synchronized Video Instance Segmentation

26 September 2024·2160 words·11 mins

Computer Vision Video Understanding 🏢 University of Hong Kong

SyncVIS: A new framework for video instance segmentation achieves state-of-the-art results by synchronously modeling video and frame-level information, overcoming limitations of asynchronous approache…

SyncTweedies: A General Generative Framework Based on Synchronized Diffusions

26 September 2024·4065 words·20 mins

AI Generated Computer Vision Image Generation 🏢 KAIST

SyncTweedies: a zero-shot diffusion synchronization framework generates diverse visual content (images, panoramas, 3D textures) by synchronizing multiple diffusion processes without fine-tuning, demon…

Suppress Content Shift: Better Diffusion Features via Off-the-Shelf Generation Techniques

26 September 2024·3213 words·16 mins

Computer Vision Image Generation 🏢 Institute of Information Engineering, CAS

Boosting diffusion model features: This paper introduces GATE, a novel method to suppress ‘content shift’ in diffusion features, improving their quality via off-the-shelf generation techniques.

SuperVLAD: Compact and Robust Image Descriptors for Visual Place Recognition

26 September 2024·3456 words·17 mins

AI Generated Computer Vision Visual Place Recognition 🏢 Tsinghua University

SuperVLAD: A new visual place recognition method boasts superior robustness and compactness, outperforming state-of-the-art techniques by significantly reducing parameters and dimensions.

Subsurface Scattering for Gaussian Splatting

26 September 2024·2275 words·11 mins

Computer Vision 3D Vision 🏢 University of Tübingen

Real-time rendering of objects with subsurface scattering effects is now possible with SSS-GS, a novel method combining explicit surface geometry and implicit subsurface scattering for high-quality no…

Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning

26 September 2024·3674 words·18 mins

AI Generated Computer Vision Image Classification 🏢 Google Research

Structured Unrestricted-Rank Matrices (SURMs) revolutionize parameter-efficient fine-tuning by offering greater flexibility and accuracy than existing methods like LoRA, achieving significant gains in…

StreamFlow: Streamlined Multi-Frame Optical Flow Estimation for Video Sequences

26 September 2024·2803 words·14 mins

AI Generated Computer Vision Video Understanding 🏢 Peking University

StreamFlow accelerates video optical flow estimation by 44% via a streamlined in-batch multi-frame pipeline and innovative spatiotemporal modeling, achieving state-of-the-art results.

STONE: A Submodular Optimization Framework for Active 3D Object Detection

26 September 2024·2151 words·11 mins

AI Generated Computer Vision 3D Vision 🏢 University of Texas at Dallas

STONE: A novel submodular optimization framework drastically cuts 3D object detection training costs by cleverly selecting the most informative LiDAR point cloud data for labeling, achieving state-of-…

StepbaQ: Stepping backward as Correction for Quantized Diffusion Models

26 September 2024·2381 words·12 mins

AI Generated Computer Vision Image Generation 🏢 MediaTek

StepbaQ enhances quantized diffusion models by correcting accumulated quantization errors via a novel sampling step correction mechanism, significantly improving model accuracy without modifying exist…

START: A Generalized State Space Model with Saliency-Driven Token-Aware Transformation

26 September 2024·2428 words·12 mins

Computer Vision Domain Generalization 🏢 Nanjing University

START, a novel SSM-based architecture with saliency-driven token-aware transformation, achieves state-of-the-art domain generalization performance with efficient linear complexity.

Stable-Pose: Leveraging Transformers for Pose-Guided Text-to-Image Generation

26 September 2024·3451 words·17 mins

Computer Vision Image Generation 🏢 Munich Center for Machine Learning

Stable-Pose: Precise human pose guidance for text-to-image synthesis.

Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective

26 September 2024·2011 words·10 mins

Computer Vision Image Generation 🏢 School of Data Science, University of Science and Technology of China

DiGIT stabilizes image autoregressive models’ latent space using a novel discrete tokenizer from self-supervised learning, achieving state-of-the-art image generation.

Stability and Generalizability in SDE Diffusion Models with Measure-Preserving Dynamics

26 September 2024·2499 words·12 mins

Computer Vision Image Generation 🏢 University of Oxford

D³GM, a novel score-based diffusion model, enhances stability and generalizability in solving inverse problems by leveraging measure-preserving dynamics, enabling robust image reconstruction across dive…

SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening

26 September 2024·2088 words·10 mins

Computer Vision Image Generation 🏢 University of Electronic Science and Technology of China

SSDiff: A novel spatial-spectral integrated diffusion model for superior remote sensing pansharpening.

SSA-Seg: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation

26 September 2024·2332 words·11 mins

Computer Vision Image Segmentation 🏢 Huawei Noah's Ark Lab Zhejiang University

SSA-Seg improves semantic segmentation by adapting pixel-level classifiers to the test image’s semantic and spatial features, achieving state-of-the-art performance with minimal extra computational co…

SplitNeRF: Split Sum Approximation Neural Field for Joint Geometry, Illumination, and Material Estimation

26 September 2024·5201 words·25 mins

AI Generated Computer Vision 3D Vision 🏢 King Abdullah University of Science and Technology

SplitNeRF: One-hour training on a single GPU yields state-of-the-art scene geometry, lighting, and material property estimation!

Splatter a Video: Video Gaussian Representation for Versatile Processing

26 September 2024·2610 words·13 mins

Computer Vision Video Understanding 🏢 University of Hong Kong

Researchers introduce Video Gaussian Representation (VGR) for versatile video processing, embedding videos into explicit 3D Gaussians for intuitive motion and appearance modeling.

Spiking Transformer with Experts Mixture

26 September 2024·2017 words·10 mins

Computer Vision Image Classification 🏢 Peking University

Spiking Experts Mixture Mechanism (SEMM) boosts Spiking Transformers by integrating Mixture-of-Experts for efficient, sparse conditional computation, achieving significant performance improvements on …