Computer Vision

Learning Low-Rank Feature for Thorax Disease Classification

26 September 2024·3584 words·17 mins

AI Generated Computer Vision Image Classification 🏢 School of Computing and Augmented Intelligence, Arizona State University

Low-Rank Feature Learning (LRFL) significantly boosts thorax disease classification accuracy by reducing noise and background interference in medical images.

Learning Interaction-aware 3D Gaussian Splatting for One-shot Hand Avatars

26 September 2024·2288 words·11 mins

Computer Vision 3D Vision 🏢 Shenzhen Campus of Sun Yat-Sen University

Create animatable interacting hand avatars from a single image using a novel two-stage interaction-aware 3D Gaussian splatting framework!

Learning Image Priors Through Patch-Based Diffusion Models for Solving Inverse Problems

26 September 2024·3556 words·17 mins

Computer Vision Image Generation 🏢 University of Michigan

PaDIS: Patch-based diffusion inverse solver learns efficient image priors from image patches, enabling high-resolution inverse problem solutions with reduced computational costs and data needs.

Learning Group Actions on Latent Representations

26 September 2024·2124 words·10 mins

Computer Vision Image Generation 🏢 University of Virginia

This paper proposes a novel method to model group actions within autoencoders by learning these actions in the latent space, enhancing model versatility and improving performance in various real-world…

Learning from Pattern Completion: Self-supervised Controllable Generation

26 September 2024·3650 words·18 mins

AI Generated Computer Vision Image Generation 🏢 Peking University

Self-Supervised Controllable Generation (SCG) framework achieves brain-like associative generation by using a modular autoencoder with equivariance constraints and a self-supervised pattern completion…

Learning from Offline Foundation Features with Tensor Augmentations

26 September 2024·1797 words·9 mins

Computer Vision Image Classification 🏢 KTH Royal Institute of Technology

LOFF-TA uses offline foundation model features and tensor augmentations for resource-light training, achieving up to 37x faster training and 26x lower GPU memory usage.

Learning Frequency-Adapted Vision Foundation Model for Domain Generalized Semantic Segmentation

26 September 2024·2199 words·11 mins

Computer Vision Image Segmentation 🏢 Westlake University

FADA: a novel frequency-adapted learning scheme boosts domain-generalized semantic segmentation by decoupling style and content using Haar wavelets, achieving state-of-the-art results.

Learning Disentangled Representations for Perceptual Point Cloud Quality Assessment via Mutual Information Minimization

26 September 2024·1608 words·8 mins

AI Generated Computer Vision 3D Vision 🏢 Cooperative Medianet Innovation Center, Shanghai Jiao Tong University

DisPA: a novel disentangled representation learning framework for perceptual point cloud quality assessment achieves superior performance by minimizing mutual information between content and distortio…

Learning De-Biased Representations for Remote-Sensing Imagery

26 September 2024·2326 words·11 mins

Computer Vision Object Detection 🏢 Singapore Management University

DebLoRA: A novel unsupervised learning approach debiases LoRA for remote sensing imagery, boosting minor class performance without sacrificing major class accuracy.

Learning Commonality, Divergence and Variety for Unsupervised Visible-Infrared Person Re-identification

26 September 2024·1252 words·6 mins

Computer Vision Person Re-Identification 🏢 Institute of Artificial Intelligence, Xiamen University

Progressive Contrastive Learning with Hard & Dynamic Prototypes (PCLHD) revolutionizes unsupervised visible-infrared person re-identification by effectively capturing data commonality, divergence, and…

Learning Bregman Divergences with Application to Robustness

26 September 2024·2210 words·11 mins

Computer Vision Image Classification 🏢 ETH Zurich

Learned Bregman divergences significantly improve image corruption robustness in adversarial training.

Learning 3D Garment Animation from Trajectories of A Piece of Cloth

26 September 2024·2097 words·10 mins

Computer Vision 3D Vision 🏢 Nanyang Technological University

Animates diverse garments realistically from a single cloth’s trajectory using a disentangled learning approach and Energy Unit Network (EUNet).

Learning 3D Equivariant Implicit Function with Patch-Level Pose-Invariant Representation

26 September 2024·2788 words·14 mins

Computer Vision 3D Vision 🏢 Xi'an Jiaotong University

PEIF uses patch-level pose-invariant representations and 3D patch-level equivariance to reach state-of-the-art 3D surface reconstruction accuracy, even with varied poses and datasets.

LCM: Locally Constrained Compact Point Cloud Model for Masked Point Modeling

26 September 2024·2913 words·14 mins

AI Generated Computer Vision 3D Vision 🏢 Tsinghua University

LCM: a novel, locally constrained, compact point cloud model surpasses Transformer-based methods by significantly improving performance and efficiency in various downstream tasks.

Latent Representation Matters: Human-like Sketches in One-shot Drawing Tasks

26 September 2024·3093 words·15 mins

Computer Vision Image Generation 🏢 Artificial and Natural Intelligence Toulouse Institute

AI now draws almost as well as humans, thanks to novel latent diffusion model regularizations that mimic human cognitive biases.

Large Spatial Model: End-to-end Unposed Images to Semantic 3D

26 September 2024·1766 words·9 mins

Computer Vision 3D Vision 🏢 NVIDIA Research

Large Spatial Model (LSM) achieves real-time semantic 3D reconstruction from just two unposed images, unifying multiple 3D vision tasks in a single feed-forward pass.

LAM3D: Large Image-Point Clouds Alignment Model for 3D Reconstruction from Single Image

26 September 2024·2617 words·13 mins

AI Generated Computer Vision 3D Vision 🏢 Australian National University

LAM3D: A novel framework uses point cloud data to boost single-image 3D mesh reconstruction accuracy, achieving state-of-the-art results in just 6 seconds.

L4GM: Large 4D Gaussian Reconstruction Model

26 September 2024·2618 words·13 mins

Computer Vision 3D Vision 🏢 University of Toronto

L4GM: The first 4D model generating high-quality animated 3D objects from single-view videos in a single feed-forward pass.

L-TTA: Lightweight Test-Time Adaptation Using a Versatile Stem Layer

26 September 2024·2871 words·14 mins

AI Generated Computer Vision Image Classification 🏢 Seoul National University of Science and Technology

L-TTA: A lightweight test-time adaptation method using a versatile stem layer minimizes channel-wise uncertainty for rapid and memory-efficient adaptation to new domains.

KOALA: Empirical Lessons Toward Memory-Efficient and Fast Diffusion Models for Text-to-Image Synthesis

26 September 2024·5238 words·25 mins

Computer Vision Image Generation 🏢 Electronics and Telecommunications Research Institute

KOALA: efficient text-to-image diffusion models that are 4x faster and 69% smaller than SDXL, generating 1024px images on consumer GPUs with 8GB VRAM.