Computer Vision

Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner

26 September 2024·2916 words·14 mins

AI Generated Computer Vision Image Generation 🏢 Beijing University of Posts and Telecommunications

LucidDrag: Semantic-aware dragging transforms image editing with an intention reasoner and collaborative guidance, achieving superior accuracy, image fidelity, and semantic diversity.

LiteVAE: Lightweight and Efficient Variational Autoencoders for Latent Diffusion Models

26 September 2024·3223 words·16 mins

Computer Vision Image Generation 🏢 ETH Zurich

LiteVAE: A new autoencoder design for latent diffusion models boosts efficiency sixfold without sacrificing image quality, achieving faster training and lower memory needs via the 2D discrete wavelet …

LION: Linear Group RNN for 3D Object Detection in Point Clouds

26 September 2024·3911 words·19 mins

AI Generated Computer Vision Object Detection 🏢 Huazhong University of Science and Technology

LION: Linear Group RNNs conquer 3D object detection in sparse point clouds by enabling efficient long-range feature interaction, significantly outperforming transformer-based methods.

LinNet: Linear Network for Efficient Point Cloud Representation Learning

26 September 2024·2362 words·12 mins

Computer Vision 3D Vision 🏢 Northwest University

LinNet: A linear-time point cloud network achieving 10x speedup over PointNeXt, with state-of-the-art accuracy on various benchmarks.

Linearly Decomposing and Recomposing Vision Transformers for Diverse-Scale Models

26 September 2024·2125 words·10 mins

Computer Vision Image Classification 🏢 School of Computer Science and Engineering, Southeast University

Linearly decompose and recompose Vision Transformers to create diverse-scale models efficiently, reducing computational costs and improving flexibility for various applications.

Lightweight Frequency Masker for Cross-Domain Few-Shot Semantic Segmentation

26 September 2024·3232 words·16 mins

AI Generated Computer Vision Image Segmentation 🏢 Huazhong University of Science and Technology

Lightweight Frequency Masker significantly improves cross-domain few-shot semantic segmentation by cleverly filtering frequency components of images, thereby reducing inter-channel correlation and enh…

Lighting Every Darkness with 3DGS: Fast Training and Real-Time Rendering for HDR View Synthesis

26 September 2024·3953 words·19 mins

AI Generated Computer Vision 3D Vision 🏢 Nankai University

LE3D: Uses 3DGS to achieve real-time HDR view synthesis from noisy RAW images, significantly reducing training time and improving rendering speed.

LG-CAV: Train Any Concept Activation Vector with Language Guidance

26 September 2024·3860 words·19 mins

AI Generated Computer Vision Vision-Language Models 🏢 Zhejiang University

LG-CAV (Train any Concept Activation Vector with Language Guidance) leverages vision-language models to train CAVs without labeled data, achieving superior accuracy and enabling state-of-the-art model…

Lexicon3D: Probing Visual Foundation Models for Complex 3D Scene Understanding

26 September 2024·2610 words·13 mins

Computer Vision Scene Understanding 🏢 University of Illinois Urbana-Champaign

Lexicon3D: a first comprehensive study probing diverse visual foundation models for superior 3D scene understanding, revealing that unsupervised image models outperform others across various tasks.

Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation

26 September 2024·3134 words·15 mins

AI Generated Computer Vision Image Segmentation 🏢 School of Electronic Engineering and Computer Science, Queen Mary University of London

ProMaC leverages MLLM hallucinations in an iterative framework to generate precise prompts for accurate object segmentation, minimizing manual prompt dependency.

Learning-to-Cache: Accelerating Diffusion Transformer via Layer Caching

26 September 2024·2463 words·12 mins

Computer Vision Image Generation 🏢 National University of Singapore

Learning-to-Cache (L2C) accelerates diffusion transformers by caching layer computations, achieving significant speedups with minimal performance loss.

Learning Where to Edit Vision Transformers

26 September 2024·3346 words·16 mins

AI Generated Computer Vision Image Classification 🏢 City University of Hong Kong

Meta-learning a hypernetwork on CutMix-augmented data enables data-efficient and precise correction of vision transformer errors by identifying optimal parameters for fine-tuning.

Learning Truncated Causal History Model for Video Restoration

26 September 2024·2473 words·12 mins

Computer Vision Video Understanding 🏢 University of Alberta

TURTLE: A video restoration framework that learns a truncated causal history model for efficiency, achieving state-of-the-art results on various benchmark ta…

Learning Transferable Features for Implicit Neural Representations

26 September 2024·4038 words·19 mins

AI Generated Computer Vision Image Generation 🏢 Rice University

STRAINER: A new framework enabling faster, higher-quality INR fitting by leveraging transferable features across similar signals, significantly boosting INR performance.

Learning to Merge Tokens via Decoupled Embedding for Efficient Vision Transformers

26 September 2024·3286 words·16 mins

AI Generated Computer Vision Image Classification 🏢 KAIST

Decoupled Token Embedding for Merging (DTEM) significantly improves Vision Transformer efficiency by using a decoupled embedding module for relaxed token merging, achieving consistent performance gain…

Learning to Edit Visual Programs with Self-Supervision

26 September 2024·2121 words·10 mins

Computer Vision Visual Question Answering 🏢 Brown University

AI learns to edit visual programs more accurately using a self-supervised method that combines one-shot program generation with iterative local edits, significantly boosting performance, especially wi…

Learning to Decouple the Lights for 3D Face Texture Modeling

26 September 2024·3463 words·17 mins

Computer Vision Face Recognition 🏢 School of Computing, National University of Singapore

Researchers developed Light Decoupling, a novel approach to model 3D facial textures under complex illumination, achieving more realistic and accurate results by decoupling unnatural lighting into mul…

Learning to be Smooth: An End-to-End Differentiable Particle Smoother

26 September 2024·2507 words·12 mins

Computer Vision 3D Vision 🏢 UC Irvine

The learned Mixture Density Particle Smoother (MDPS) surpasses the state of the art in accurate, differentiable city-scale vehicle localization.

Learning Structured Representations with Hyperbolic Embeddings

26 September 2024·3560 words·17 mins

Computer Vision Representation Learning 🏢 University of Illinois, Urbana-Champaign

HypStructure boosts representation learning by embedding label hierarchies into hyperbolic space, improving accuracy and interpretability.

Learning Optimal Lattice Vector Quantizers for End-to-end Neural Image Compression

26 September 2024·1532 words·8 mins

Computer Vision Image Compression 🏢 Department of Electronic Engineering, Shanghai Jiao Tong University

Learned optimal lattice vector quantization (OLVQ) drastically boosts neural image compression efficiency by adapting quantizer structures to latent feature distributions, achieving significant rate-d…