Computer Vision
Neural Gaffer: Relighting Any Object via Diffusion
·2042 words·10 mins·
Computer Vision
Image Generation
🏢 Cornell University
Neural Gaffer: relighting any object via diffusion, using a single image and an environment map to produce high-quality, realistic relit images.
Neural Experts: Mixture of Experts for Implicit Neural Representations
·3441 words·17 mins·
Computer Vision
3D Vision
🏢 Roblox
Boosting implicit neural representations, Neural Experts uses a Mixture of Experts architecture to achieve faster, more accurate, and memory-efficient signal reconstruction across various tasks.
Neural Cover Selection for Image Steganography
·3814 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 University of Texas at Austin
This study introduces a neural cover selection framework for image steganography, optimizing latent spaces in generative models to improve message recovery and image quality.
Neural Concept Binder
·3025 words·15 mins·
Computer Vision
Visual Question Answering
🏢 Computer Science Department, TU Darmstadt
The Neural Concept Binder (NCB) framework learns expressive, inspectable, and revisable visual concepts unsupervised, integrating both continuous and discrete representations for seamless use in neura…
NeuMA: Neural Material Adaptor for Visual Grounding of Intrinsic Dynamics
·2379 words·12 mins·
Computer Vision
3D Vision
🏢 MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University
NeuMA: a novel neural material adaptor that corrects existing physical models to accurately learn complex dynamics from visual observations.
NaRCan: Natural Refined Canonical Image with Integration of Diffusion Prior for Video Editing
·2217 words·11 mins·
Computer Vision
Video Understanding
🏢 National Yang Ming Chiao Tung University
NaRCan: High-quality video editing via diffusion priors and hybrid deformation fields.
MVSplat360: Feed-Forward 360 Scene Synthesis from Sparse Views
·1997 words·10 mins·
Computer Vision
3D Vision
🏢 Monash University
MVSplat360: feed-forward 360° scene synthesis from just a few sparse-view images.
MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps
·2981 words·14 mins·
AI Generated
Computer Vision
3D Vision
🏢 National University of Singapore
MVSDet uses efficient plane sweeps for accurate indoor 3D object detection from multiple images, significantly outperforming previous NeRF-based methods.
MVInpainter: Learning Multi-View Consistent Inpainting to Bridge 2D and 3D Editing
·3629 words·18 mins·
Computer Vision
3D Vision
🏢 Fudan University
MVInpainter: Pose-free multi-view consistent inpainting bridges 2D and 3D editing by simplifying 3D editing to a multi-view 2D inpainting task.
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling
·2497 words·12 mins·
Computer Vision
3D Vision
🏢 Nanyang Technological University
MVGamba: A unified, feed-forward 3D content generation model achieving state-of-the-art quality and speed using an RNN-like state space model for efficient multi-view Gaussian reconstruction.
MV2Cyl: Reconstructing 3D Extrusion Cylinders from Multi-View Images
·3293 words·16 mins·
Computer Vision
3D Vision
🏢 Korea Advanced Institute of Science and Technology
MV2Cyl: a novel method that reconstructs 3D extrusion cylinder CAD models directly from multi-view images, surpassing the accuracy of methods that use raw 3D geometry.
Multiview Scene Graph
·2365 words·12 mins·
Computer Vision
Scene Understanding
🏢 New York University
AI models struggle to understand 3D space like humans do. This paper introduces Multiview Scene Graphs (MSGs) – a new topological scene representation using interconnected place and object nodes buil…
Multistep Distillation of Diffusion Models via Moment Matching
·2156 words·11 mins·
Computer Vision
Image Generation
🏢 Google DeepMind
A new method distills slow diffusion models into fast, few-step models by matching data expectations, achieving state-of-the-art results on ImageNet.
MultiPull: Detailing Signed Distance Functions by Pulling Multi-Level Queries at Multi-Step
·3626 words·18 mins·
Computer Vision
3D Vision
🏢 Tsinghua University
MultiPull: a novel method reconstructing detailed 3D surfaces from raw point clouds using multi-step optimization of multi-level features, significantly improving accuracy and detail.
Multi-view Masked Contrastive Representation Learning for Endoscopic Video Analysis
·2187 words·11 mins·
Computer Vision
Video Understanding
🏢 Xiangtan University
Multi-view Masked Contrastive Representation Learning (M²CRL) significantly boosts endoscopic video analysis by using a novel multi-view masking strategy and contrastive learning, achieving state-of-t…
Multi-times Monte Carlo Rendering for Inter-reflection Reconstruction
·1845 words·9 mins·
Computer Vision
3D Vision
🏢 Shanghai Jiao Tong University
Ref-MC2 reconstructs high-fidelity 3D objects with inter-reflections by using a novel multi-times Monte Carlo sampling strategy, achieving superior performance in accuracy and efficiency.
Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model
·2122 words·10 mins·
Computer Vision
Image Classification
🏢 City University of Hong Kong
MSVMamba: a novel multi-scale vision model built on state-space models that achieves high accuracy in image classification and object detection while maintaining linear complexity, solving the long-rang…
Multi-scale Consistency for Robust 3D Registration via Hierarchical Sinkhorn Tree
·2306 words·11 mins·
Computer Vision
3D Vision
🏢 Tsinghua University
Hierarchical Sinkhorn Tree (HST) robustly retrieves accurate 3D point cloud correspondences using multi-scale consistency, outperforming state-of-the-art methods.
Multi-hypotheses Conditioned Point Cloud Diffusion for 3D Human Reconstruction from Occluded Images
·2520 words·12 mins·
AI Generated
Computer Vision
3D Vision
🏢 KAIST
MHCDIFF: a novel pipeline using multi-hypotheses conditioned point cloud diffusion for accurate 3D human reconstruction from occluded images, outperforming state-of-the-art methods.
MTGS: A Novel Framework for Multi-Person Temporal Gaze Following and Social Gaze Prediction
·3398 words·16 mins·
AI Generated
Computer Vision
Video Understanding
🏢 Idiap Research Institute
MTGS: a unified framework jointly predicts gaze and social gaze (shared attention, mutual gaze) for multiple people in videos, achieving state-of-the-art results using a temporal transformer model and…