Computer Vision
FlowTurbo: Towards Real-time Flow-Based Image Generation with Velocity Refiner
·1980 words·10 mins·
Computer Vision
Image Generation
🏢 Tsinghua University
FlowTurbo: Blazing-fast, high-quality flow-based image generation via a velocity refiner!
Flow Snapshot Neurons in Action: Deep Neural Networks Generalize to Biological Motion Perception
·2635 words·13 mins·
Computer Vision
Action Recognition
🏢 College of Computing and Data Science, Nanyang Technological University, Singapore
Deep neural networks finally match human biological motion perception capabilities by leveraging patch-level optical flows and innovative neuron designs, achieving a 29% accuracy improvement.
Flow Priors for Linear Inverse Problems via Iterative Corrupted Trajectory Matching
·2171 words·11 mins·
Computer Vision
Image Generation
🏢 UC Los Angeles
ICTM efficiently solves linear inverse problems using flow priors by iteratively optimizing local MAP objectives, outperforming other flow-based methods.
Flexible Context-Driven Sensory Processing in Dynamical Vision Models
·2040 words·10 mins·
Computer Vision
Vision-Language Models
🏢 MIT
Biologically-inspired DCnet neural network flexibly modulates visual processing based on context, outperforming existing models on visual search and attention tasks.
Flaws can be Applause: Unleashing Potential of Segmenting Ambiguous Objects in SAM
·2042 words·10 mins·
Computer Vision
Image Segmentation
🏢 Chinese University of Hong Kong
A-SAM: Turning SAM’s inherent ambiguity into an advantage for controllable, diverse, and convincing ambiguous object segmentation.
Flatten Anything: Unsupervised Neural Surface Parameterization
·2390 words·12 mins·
Computer Vision
3D Vision
🏢 Department of Computer Science, City University of Hong Kong
Flatten Anything Model (FAM) revolutionizes neural surface parameterization with unsupervised learning, handling complex topologies and unstructured data fully automatically.
Fine-grained Image-to-LiDAR Contrastive Distillation with Visual Foundation Models
·4174 words·20 mins·
AI Generated
Computer Vision
3D Vision
🏢 City University of Hong Kong
OLIVINE uses visual foundation models for fine-grained image-to-LiDAR contrastive distillation, mitigating self-conflict issues and improving 3D representation learning.
Finding NeMo: Localizing Neurons Responsible For Memorization in Diffusion Models
·3454 words·17 mins·
Computer Vision
Image Generation
🏢 German Research Center for Artificial Intelligence
NEMO pinpoints and deactivates the neurons that memorize training data in diffusion models, improving privacy and image diversity.
FIFO-Diffusion: Generating Infinite Videos from Text without Training
·3112 words·15 mins·
Computer Vision
Video Understanding
🏢 Seoul National University
FIFO-Diffusion generates infinitely long, high-quality videos from text prompts using a pretrained model, solving the challenge of long video generation without retraining.
FFAM: Feature Factorization Activation Map for Explanation of 3D Detectors
·2339 words·11 mins·
Computer Vision
3D Vision
🏢 School of Computer Science and Engineering, Sun Yat-Sen University
FFAM uses feature factorization and gradient weighting to produce high-quality visual explanations for 3D object detectors, improving model interpretability and trust.
FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training
·2255 words·11 mins·
Computer Vision
3D Vision
🏢 University of Amsterdam
FewViewGS: A novel method for high-quality novel view synthesis from sparse images using a multi-stage training scheme and a new locality-preserving regularization for 3D Gaussians.
Fetch and Forge: Efficient Dataset Condensation for Object Detection
·1843 words·9 mins·
Computer Vision
Object Detection
🏢 Tencent Youtu Lab
DCOD, a novel two-stage framework (Fetch & Forge), efficiently condenses object detection datasets, achieving comparable performance to full datasets at extremely low compression rates, significantly …
Federated Black-Box Adaptation for Semantic Segmentation
·3023 words·15 mins·
AI Generated
Computer Vision
Image Segmentation
🏢 Johns Hopkins University
BlackFed: Privacy-preserving federated semantic segmentation using zero/first-order optimization, avoiding gradient/weight sharing!
Feature-Level Adversarial Attacks and Ranking Disruption for Visible-Infrared Person Re-identification
·1748 words·9 mins·
Computer Vision
Face Recognition
🏢 Xidian University
New feature-level adversarial attacks disrupt visible-infrared person re-identification (VIReID) systems by cleverly aligning and manipulating features to cause incorrect ranking results.
FasterDiT: Towards Faster Diffusion Transformers Training without Architecture Modification
·2727 words·13 mins·
AI Generated
Computer Vision
Image Generation
🏢 Huazhong University of Science and Technology
FasterDiT accelerates Diffusion Transformers training 7x without architecture modification by analyzing SNR probability density functions and implementing a new supervision method.
Faster Neighborhood Attention: Reducing the O(n^2) Cost of Self Attention at the Threadblock Level
·2848 words·14 mins·
AI Generated
Computer Vision
Image Classification
🏢 SHI Labs @ Georgia Tech
This research dramatically accelerates neighborhood attention, a cost-effective self-attention mechanism, through novel GEMM-based and fused kernel implementations, boosting performance by up to 1759%…
Faster Diffusion: Rethinking the Role of the Encoder for Diffusion Model Inference
·5119 words·25 mins·
AI Generated
Computer Vision
Image Generation
🏢 Nankai University
Faster Diffusion achieves significant speedups in diffusion model inference by cleverly reusing encoder features and enabling parallel processing, eliminating the need for computationally expensive di…
FastDrag: Manipulate Anything in One Step
·2454 words·12 mins·
Computer Vision
Image Generation
🏢 College of Computer Science and Technology, Harbin Engineering University
FastDrag: One-step image manipulation using generative models, drastically improving editing speed without sacrificing quality.
Fast samplers for Inverse Problems in Iterative Refinement models
·3647 words·18 mins·
AI Generated
Computer Vision
Image Generation
🏢 UC Irvine
Conditional Conjugate Integrators (CCI) drastically accelerate sampling in iterative refinement models for inverse problems, achieving high-quality results with only a few steps.
Fast Encoder-Based 3D from Casual Videos via Point Track Processing
·2766 words·13 mins·
Computer Vision
3D Vision
🏢 NVIDIA Research
TRACKSTO4D: Fast and accurate 3D reconstruction from casual videos using 2D point tracks, reducing runtime by up to 95% while matching state-of-the-art accuracy.