Computer Vision

CAT: Coordinating Anatomical-Textual Prompts for Multi-Organ and Tumor Segmentation
·2794 words·14 mins
AI Generated Computer Vision Image Segmentation 🏢 Qing Yuan Research Institute, Shanghai Jiao Tong University
CAT: A novel dual-prompt model that coordinates anatomical and textual prompts for superior multi-organ & tumor segmentation in medical imaging, overcoming limitations of single-prompt methods.
Can We Leave Deepfake Data Behind in Training Deepfake Detector?
·2627 words·13 mins
Computer Vision Face Recognition 🏢 Tencent AI Lab
ProDet: Deepfake detection enhanced by progressively organizing blendfake and deepfake data in the latent space, improving generalization and robustness.
Can Simple Averaging Defeat Modern Watermarks?
·3146 words·15 mins
Computer Vision Image Generation 🏢 National University of Singapore
Simple averaging of watermarked images reveals hidden patterns, enabling watermark removal and forgery, thus highlighting the vulnerability of content-agnostic watermarking methods.
Bridging the Divide: Reconsidering Softmax and Linear Attention
·2335 words·11 mins
Computer Vision Image Classification 🏢 Tsinghua University
InLine attention, a novel method, bridges the performance gap between softmax and linear attention by incorporating injectivity and local modeling, achieving superior performance while maintaining lin…
Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack
·4173 words·20 mins
AI Generated Computer Vision Image Classification 🏢 School of Data Science, The Chinese University of Hong Kong
Researchers discover that existing backdoor defenses leave vulnerabilities, allowing for easy re-activation of backdoors through subtle trigger manipulation.
Breaking Semantic Artifacts for Generalized AI-generated Image Detection
·3098 words·15 mins
AI Generated Computer Vision Image Generation 🏢 School of Cyber Science and Engineering, Xi'an Jiaotong University
Researchers developed a new AI-generated image detection method that overcomes the limitation of existing detectors, achieving superior cross-scene generalization by shuffling image patches and traini…
BrainBits: How Much of the Brain are Generative Reconstruction Methods Using?
·2562 words·13 mins
Computer Vision Image Generation 🏢 MIT
BrainBits reveals that surprisingly little brain information is needed for high-fidelity image & text reconstruction, highlighting the dominance of generative model priors over neural signal extractio…
Boundary Matters: A Bi-Level Active Finetuning Method
·2351 words·12 mins
Computer Vision Active Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University
Bi-Level Active Finetuning Framework (BiLAF) revolutionizes sample selection for efficient model finetuning. Unlike existing methods, BiLAF incorporates both global diversity and local decision bounda…
Bootstrapping Top-down Information for Self-modulating Slot Attention
·2042 words·10 mins
Computer Vision Object Detection 🏢 POSTECH
This paper introduces a novel object-centric learning (OCL) framework that enhances slot attention with a self-modulating top-down pathway, significantly improving object representation and achieving …
Boosting the Transferability of Adversarial Attack on Vision Transformer with Adaptive Token Tuning
·2792 words·14 mins
Computer Vision Adversarial Attacks 🏢 Chongqing University of Technology
This paper introduces Adaptive Token Tuning (ATT), which boosts the transferability of adversarial attacks on vision transformers, improving attack success rate by 10.1% over existing methods.
BOLD: Boolean Logic Deep Learning
·3864 words·19 mins
Computer Vision Image Classification 🏢 Huawei Paris Research Center
Boolean Logic Deep Learning (BOLD) revolutionizes deep learning by enabling training with Boolean weights and activations, achieving state-of-the-art accuracy with drastically reduced energy consumpti…
Blind Image Restoration via Fast Diffusion Inversion
·1974 words·10 mins
Computer Vision Image Generation 🏢 Computer Vision Group, Institute of Informatics, University of Bern, Switzerland
BIRD, a novel blind image restoration method, jointly optimizes degradation model parameters and the restored image, ensuring realistic outputs via fast diffusion inversion and achieving state-of-the-a…
BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference
·3033 words·15 mins
Computer Vision Image Generation 🏢 University of Michigan
BLAST matrix learns efficient weight structures for faster deep learning inference, achieving significant compression and performance gains on various models.
BitsFusion: 1.99 bits Weight Quantization of Diffusion Model
·5994 words·29 mins
AI Generated Computer Vision Image Generation 🏢 Snap Inc.
BitsFusion achieves 7.9x smaller Stable Diffusion models by quantizing UNet weights to 1.99 bits, surprisingly improving image generation quality!
bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction
·3565 words·17 mins
Computer Vision Video Understanding 🏢 Case Western Reserve University
bit2bit reconstructs high-quality videos from sparse, binary quanta image sensor data using self-supervised photon location prediction, significantly improving resolution and usability.
Biologically-Inspired Learning Model for Instructed Vision
·2864 words·14 mins
Computer Vision Image Classification 🏢 Weizmann Institute of Science
Biologically-inspired AI model integrates learning & visual guidance via a novel ‘Counter-Hebb’ learning mechanism, achieving competitive performance on multi-task learning benchmarks.
Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis
·2801 words·14 mins
Computer Vision 3D Vision 🏢 Tsinghua University
Binocular-guided 3D Gaussian splatting with self-supervision generates high-quality novel views from sparse inputs without external priors, significantly outperforming state-of-the-art methods.
Binarized Diffusion Model for Image Super-Resolution
·1567 words·8 mins
Computer Vision Image Generation 🏢 ETH Zurich
BI-DiffSR, a novel binarized diffusion model, achieves high-quality image super-resolution with significantly reduced memory and computational costs, outperforming existing methods.
BiDM: Pushing the Limit of Quantization for Diffusion Models
·2390 words·12 mins
Computer Vision Image Generation 🏢 Beihang University
BiDM achieves full 1-bit quantization in diffusion models, significantly improving storage and speed without sacrificing image quality, setting a new state-of-the-art.
Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding
·2178 words·11 mins
Computer Vision Image Segmentation 🏢 Hong Kong University of Science and Technology
GPTrack: A novel unsupervised framework that enhances cardiac motion tracking by using sequential Gaussian processes and bidirectional recurrence, improving accuracy and efficiency.