Computer Vision

Neural Gaffer: Relighting Any Object via Diffusion

26 September 2024·2042 words·10 mins

Computer Vision Image Generation 🏢 Cornell University

Neural Gaffer: a diffusion-based method that relights any object from a single image and a target environment map, producing high-quality, realistic relit images.

Neural Experts: Mixture of Experts for Implicit Neural Representations

26 September 2024·3441 words·17 mins

Computer Vision 3D Vision 🏢 Roblox

Boosting implicit neural representations, Neural Experts uses a Mixture of Experts architecture to achieve faster, more accurate, and memory-efficient signal reconstruction across various tasks.

Neural Cover Selection for Image Steganography

26 September 2024·3814 words·18 mins

AI Generated Computer Vision Image Generation 🏢 University of Texas at Austin

This study introduces a neural cover selection framework for image steganography, optimizing latent spaces in generative models to improve message recovery and image quality.

Neural Concept Binder

26 September 2024·3025 words·15 mins

Computer Vision Visual Question Answering 🏢 Computer Science Department, TU Darmstadt

The Neural Concept Binder (NCB) framework learns expressive, inspectable, and revisable visual concepts unsupervised, integrating both continuous and discrete representations for seamless use in neura…

NeuMA: Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics

26 September 2024·2379 words·12 mins

Computer Vision 3D Vision 🏢 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University

NeuMA: a novel neural material adaptor that corrects existing physical models to accurately learn complex dynamics from visual observations.

NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing

26 September 2024·2217 words·11 mins

Computer Vision Video Understanding 🏢 National Yang Ming Chiao Tung University

NaRCan: High-quality video editing via diffusion priors and hybrid deformation fields.

MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views

26 September 2024·1997 words·10 mins

Computer Vision 3D Vision 🏢 Monash University

MVSplat360: feed-forward synthesis of 360° scene views from just a few input images.

MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps

26 September 2024·2981 words·14 mins

AI Generated Computer Vision 3D Vision 🏢 National University of Singapore

MVSDet uses efficient plane sweeps for accurate indoor 3D object detection from multiple images, significantly outperforming previous NeRF-based methods.

MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing

26 September 2024·3629 words·18 mins

Computer Vision 3D Vision 🏢 Fudan University

MVInpainter: Pose-free multi-view consistent inpainting bridges 2D and 3D editing by simplifying 3D editing to a multi-view 2D inpainting task.

MVGamba: Unify 3D Content Generation as State Space Sequence Modeling

26 September 2024·2497 words·12 mins

Computer Vision 3D Vision 🏢 Nanyang Technological University

MVGamba: A unified, feed-forward 3D content generation model achieving state-of-the-art quality and speed using an RNN-like state space model for efficient multi-view Gaussian reconstruction.

MV2Cyl: Reconstructing 3D Extrusion Cylinders from Multi-View Images

26 September 2024·3293 words·16 mins

Computer Vision 3D Vision 🏢 Korea Advanced Institute of Science and Technology

MV2Cyl: a novel method that reconstructs 3D extrusion cylinder CAD models directly from multi-view images, surpassing the accuracy of methods that use raw 3D geometry.

Multiview Scene Graph

26 September 2024·2365 words·12 mins

Computer Vision Scene Understanding 🏢 New York University

AI models struggle to understand 3D space like humans do. This paper introduces Multiview Scene Graphs (MSGs) – a new topological scene representation using interconnected place and object nodes buil…

Multistep Distillation of Diffusion Models via Moment Matching

26 September 2024·2156 words·11 mins

Computer Vision Image Generation 🏢 Google DeepMind

A new moment-matching method distills slow diffusion models into fast, few-step models by matching conditional expectations of the clean data given noisy data, achieving state-of-the-art results on ImageNet.

MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step

26 September 2024·3626 words·18 mins

Computer Vision 3D Vision 🏢 Tsinghua University

MultiPull: a novel method reconstructing detailed 3D surfaces from raw point clouds using multi-step optimization of multi-level features, significantly improving accuracy and detail.

Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis

26 September 2024·2187 words·11 mins

Computer Vision Video Understanding 🏢 Xiangtan University

Multi-view Masked Contrastive Representation Learning (M²CRL) significantly boosts endoscopic video analysis by using a novel multi-view masking strategy and contrastive learning, achieving state-of-t…

Multi-times Monte Carlo Rendering for Inter-reflection Reconstruction

26 September 2024·1845 words·9 mins

Computer Vision 3D Vision 🏢 Shanghai Jiao Tong University

Ref-MC2 reconstructs high-fidelity 3D objects with inter-reflections by using a novel multi-times Monte Carlo sampling strategy, achieving superior performance in accuracy and efficiency.

Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model

26 September 2024·2122 words·10 mins

Computer Vision Image Classification 🏢 City University of Hong Kong

MSVMamba: a novel multi-scale vision model built on state-space models that achieves high accuracy in image classification and object detection while maintaining linear complexity, solving the long-rang…

Multi-scale Consistency for Robust 3D Registration via Hierarchical Sinkhorn Tree

26 September 2024·2306 words·11 mins

Computer Vision 3D Vision 🏢 Tsinghua University

Hierarchical Sinkhorn Tree (HST) robustly retrieves accurate 3D point cloud correspondences using multi-scale consistency, outperforming state-of-the-art methods.

Multi-hypotheses Conditioned Point Cloud Diffusion for 3D Human Reconstruction from Occluded Images

26 September 2024·2520 words·12 mins

AI Generated Computer Vision 3D Vision 🏢 KAIST

MHCDIFF: a novel pipeline using multi-hypotheses conditioned point cloud diffusion for accurate 3D human reconstruction from occluded images, outperforming state-of-the-art methods.

MTGS: A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction

26 September 2024·3398 words·16 mins

AI Generated Computer Vision Video Understanding 🏢 Idiap Research Institute

MTGS: a unified framework that jointly predicts gaze and social gaze (shared attention, mutual gaze) for multiple people in videos, achieving state-of-the-art results using a temporal transformer model and…