Computer Vision

Generalizable and Animatable Gaussian Head Avatar

26 September 2024·3445 words·17 mins

AI Generated Computer Vision Image Generation 🏢 University of Tokyo

One-shot animatable head avatar reconstruction is achieved using a novel dual-lifting method that generates 3D Gaussians from a single image, enabling real-time expression control and rendering with s…

General Articulated Objects Manipulation in Real Images via Part-Aware Diffusion Process

26 September 2024·2623 words·13 mins

Computer Vision Image Generation 🏢 Shanghai Jiao Tong University

Part-Aware Diffusion Model (PA-Diffusion) enables precise and efficient manipulation of articulated objects in real images by using abstract 3D models and dynamic feature maps, overcoming limitations …

GaussianMarker: Uncertainty-Aware Copyright Protection of 3D Gaussian Splatting

26 September 2024·2093 words·10 mins

Computer Vision 3D Vision 🏢 NVIDIA Research

GaussianMarker: A novel uncertainty-aware watermarking method ensures robust copyright protection for 3D Gaussian Splatting assets, invisibly embedding messages into model parameters and extractable …

GaussianCut: Interactive segmentation via graph cut for 3D Gaussian Splatting

26 September 2024·3939 words·19 mins

AI Generated Computer Vision Image Segmentation 🏢 University of Toronto

GaussianCut enables intuitive 3D object selection via graph cuts on 3D Gaussian splatting, achieving competitive segmentation without extra training.

GaussianCube: A Structured and Explicit Radiance Representation for 3D Generative Modeling

26 September 2024·2946 words·14 mins

Computer Vision 3D Vision 🏢 Tsinghua University

GaussianCube revolutionizes 3D generative modeling with a structured, explicit radiance representation, achieving state-of-the-art results using significantly fewer parameters.

Gaussian Graph Network: Learning Efficient and Generalizable Gaussian Representations from Multi-view Images

26 September 2024·2277 words·11 mins

Computer Vision 3D Vision 🏢 Tsinghua University

Gaussian Graph Network (GGN) revolutionizes novel view synthesis by efficiently generating generalizable Gaussian representations from multi-view images, achieving superior rendering quality with fewe…

Fully Explicit Dynamic Gaussian Splatting

26 September 2024·3268 words·16 mins

AI Generated Computer Vision 3D Vision 🏢 School of Electrical Engineering and Computer Science

Ex4DGS achieves real-time high-quality dynamic scene rendering using explicit 4D Gaussian representations and keyframe interpolation.

Full-Distance Evasion of Pedestrian Detectors in the Physical World

26 September 2024·2691 words·13 mins

Computer Vision Object Detection 🏢 Tsinghua University

Researchers developed Full Distance Attack (FDA) to generate adversarial patterns effective against pedestrian detectors across all distances, resolving the appearance gap issue between simulated and …

Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models

26 September 2024·2491 words·12 mins

Computer Vision Object Detection 🏢 School of Computer Science and Engineering, Sun Yat-Sen University

Frozen-DETR boosts object detection accuracy by integrating frozen foundation models as feature enhancers, achieving significant performance gains without the computational cost of fine-tuning.

From Trojan Horses to Castle Walls: Unveiling Bilateral Data Poisoning Effects in Diffusion Models

26 September 2024·3334 words·16 mins

Computer Vision Image Generation 🏢 Tsinghua University

Diffusion models, while excelling in image generation, are vulnerable to data poisoning. This paper demonstrates a BadNets-like attack’s effectiveness against diffusion models, causing image misalign…

From Transparent to Opaque: Rethinking Neural Implicit Surfaces with $\alpha$-NeuS

26 September 2024·1946 words·10 mins

Computer Vision 3D Vision 🏢 Key Laboratory of System Software (CAS) and State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences

α-NeuS: A novel method for neural implicit surface reconstruction that accurately reconstructs both transparent and opaque objects simultaneously by leveraging the unique properties of distance fields…

From Chaos to Clarity: 3DGS in the Dark

26 September 2024·2516 words·12 mins

Computer Vision 3D Vision 🏢 Nanyang Technological University

Researchers developed a self-supervised learning framework to create high-dynamic-range 3D Gaussian Splatting (3DGS) models from noisy raw images, significantly improving reconstruction quality and sp…

From an Image to a Scene: Learning to Imagine the World from a Million 360° Videos

26 September 2024·2541 words·12 mins

AI Generated Computer Vision 3D Vision 🏢 University of Washington

ODIN, trained on a million 360° videos (360-1M), generates realistic novel views and reconstructs 3D scenes from single images.

FreqMark: Invisible Image Watermarking via Frequency Based Optimization in Latent Space

26 September 2024·3551 words·17 mins

AI Generated Computer Vision Image Generation 🏢 University of Science and Technology of China

FreqMark: Robust invisible image watermarking via latent frequency space optimization, resisting regeneration attacks and achieving >90% bit accuracy with high image quality.

FreqBlender: Enhancing DeepFake Detection by Blending Frequency Knowledge

26 September 2024·3125 words·15 mins

AI Generated Computer Vision Face Recognition 🏢 Ocean University of China

FreqBlender enhances DeepFake detection by cleverly blending frequency domain knowledge of real and fake faces, improving model generalization and providing a complementary strategy to existing spatia…

FreeSplat: Generalizable 3D Gaussian Splatting Towards Free View Synthesis of Indoor Scenes

26 September 2024·2183 words·11 mins

Computer Vision 3D Vision 🏢 National University of Singapore

FreeSplat achieves state-of-the-art novel view synthesis by accurately localizing 3D Gaussians from long image sequences, overcoming limitations of prior methods confined to narrow-range interpolation…

FreeLong: Training-Free Long Video Generation with SpectralBlend Temporal Attention

26 September 2024·1815 words·9 mins

Computer Vision Video Understanding 🏢 Zhejiang University

FreeLong: Generate high-fidelity long videos without retraining using spectral blending of global and local video features!

Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion

26 September 2024·2351 words·12 mins

Computer Vision Image Generation 🏢 University of Electronic Science and Technology of China

FeINFN, a novel Fourier-enhanced Implicit Neural Fusion Network, achieves state-of-the-art hyperspectral image fusion by combining spatial and frequency information in both the spatial an…

Fourier Amplitude and Correlation Loss: Beyond Using L2 Loss for Skillful Precipitation Nowcasting

26 September 2024·3838 words·19 mins

AI Generated Computer Vision Image Generation 🏢 Hong Kong University of Science and Technology

This work proposes FACL, a novel loss function for precipitation nowcasting, improving forecast sharpness and meteorological skill without sacrificing accuracy.

FouRA: Fourier Low-Rank Adaptation

26 September 2024·5441 words·26 mins

AI Generated Computer Vision Image Generation 🏢 Qualcomm AI Research

FouRA, a novel low-rank adaptation method, improves text-to-image generation by learning projections in the Fourier domain and using an adaptive rank selection strategy, addressing LoRA’s limitations o…