Computer Vision

FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner

26 September 2024·1980 words·10 mins

Computer Vision Image Generation 🏢 Tsinghua University

FlowTurbo: Blazing-fast, high-quality flow-based image generation via a velocity refiner!

Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception

26 September 2024·2635 words·13 mins

Computer Vision Action Recognition 🏢 College of Computing and Data Science, Nanyang Technological University, Singapore

Deep neural networks finally match human biological motion perception capabilities by leveraging patch-level optical flows and innovative neuron designs, achieving a 29% accuracy improvement.

Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching

26 September 2024·2171 words·11 mins

Computer Vision Image Generation 🏢 UC Los Angeles

ICTM efficiently solves linear inverse problems using flow priors by iteratively optimizing local MAP objectives, outperforming other flow-based methods.

Flexible Context-Driven Sensory Processing in Dynamical Vision Models

26 September 2024·2040 words·10 mins

Computer Vision Vision-Language Models 🏢 MIT

Biologically-inspired DCnet neural network flexibly modulates visual processing based on context, outperforming existing models on visual search and attention tasks.

Flaws can be Applause: Unleashing Potential of Segmenting Ambiguous Objects in SAM

26 September 2024·2042 words·10 mins

Computer Vision Image Segmentation 🏢 Chinese University of Hong Kong

A-SAM: Turning SAM’s inherent ambiguity into an advantage for controllable, diverse, and convincing ambiguous object segmentation.

Flatten Anything: Unsupervised Neural Surface Parameterization

26 September 2024·2390 words·12 mins

Computer Vision 3D Vision 🏢 Department of Computer Science, City University of Hong Kong

Flatten Anything Model (FAM) revolutionizes neural surface parameterization with unsupervised learning, handling complex topologies and unstructured data fully automatically.

Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models

26 September 2024·4174 words·20 mins

AI Generated Computer Vision 3D Vision 🏢 City University of Hong Kong

OLIVINE uses visual foundation models for fine-grained image-to-LiDAR contrastive distillation, mitigating self-conflict issues and improving 3D representation learning.

Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models

26 September 2024·3454 words·17 mins

Computer Vision Image Generation 🏢 German Research Center for Artificial Intelligence

NeMo pinpoints and deactivates neurons that memorize training data in diffusion models, improving privacy and image diversity.

FIFO-Diffusion: Generating Infinite Videos from Text without Training

26 September 2024·3112 words·15 mins

Computer Vision Video Understanding 🏢 Seoul National University

FIFO-Diffusion generates infinitely long, high-quality videos from text prompts using a pretrained model, solving the challenge of long video generation without retraining.

FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors

26 September 2024·2339 words·11 mins

Computer Vision 3D Vision 🏢 School of Computer Science and Engineering, Sun Yat-Sen University

FFAM uses feature factorization and gradient weighting to produce high-quality visual explanations for 3D object detectors, improving model interpretability and trust.

FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training

26 September 2024·2255 words·11 mins

Computer Vision 3D Vision 🏢 University of Amsterdam

FewViewGS: A novel method for high-quality novel view synthesis from sparse images using a multi-stage training scheme and a new locality-preserving regularization for 3D Gaussians.

Fetch and Forge: Efficient Dataset Condensation for Object Detection

26 September 2024·1843 words·9 mins

Computer Vision Object Detection 🏢 Tencent Youtu Lab

DCOD, a novel two-stage framework (Fetch & Forge), efficiently condenses object detection datasets, achieving comparable performance to full datasets at extremely low compression rates, significantly …

Federated Black-Box Adaptation for Semantic Segmentation

26 September 2024·3023 words·15 mins

AI Generated Computer Vision Image Segmentation 🏢 Johns Hopkins University

BlackFed: Privacy-preserving federated semantic segmentation using zero/first-order optimization, avoiding gradient/weight sharing!

Feature-Level Adversarial Attacks and Ranking Disruption for Visible-Infrared Person Re-identification

26 September 2024·1748 words·9 mins

Computer Vision Face Recognition 🏢 Xidian University

New feature-level adversarial attacks disrupt visible-infrared person re-identification (VIReID) systems by cleverly aligning and manipulating features to cause incorrect ranking results.

FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification

26 September 2024·2727 words·13 mins

AI Generated Computer Vision Image Generation 🏢 Huazhong University of Science and Technology

FasterDiT speeds up Diffusion Transformer training by 7x without architecture modification by analyzing SNR probability density functions and introducing a new supervision method.

Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level

26 September 2024·2848 words·14 mins

AI Generated Computer Vision Image Classification 🏢 SHI Labs @ Georgia Tech

This research dramatically accelerates neighborhood attention, a cost-effective self-attention mechanism, through novel GEMM-based and fused kernel implementations, boosting performance by up to 1759%…

Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference

26 September 2024·5119 words·25 mins

AI Generated Computer Vision Image Generation 🏢 Nankai University

Faster Diffusion achieves significant speedups in diffusion model inference by cleverly reusing encoder features and enabling parallel processing, eliminating the need for computationally expensive di…

FastDrag: Manipulate Anything in One Step

26 September 2024·2454 words·12 mins

Computer Vision Image Generation 🏢 College of Computer Science and Technology, Harbin Engineering University

FastDrag: One-step image manipulation using generative models, drastically improving editing speed without sacrificing quality.

Fast samplers for Inverse Problems in Iterative Refinement models

26 September 2024·3647 words·18 mins

AI Generated Computer Vision Image Generation 🏢 UC Irvine

Conditional Conjugate Integrators (CCI) drastically accelerate sampling in iterative refinement models for inverse problems, achieving high-quality results with only a few steps.

Fast Encoder-Based 3D from Casual Videos via Point Track Processing

26 September 2024·2766 words·13 mins

Computer Vision 3D Vision 🏢 NVIDIA Research

TracksTo4D: Fast and accurate 3D reconstruction from casual videos using 2D point tracks, reducing runtime by up to 95% while matching state-of-the-art accuracy.