Computer Vision

CAT: Coordinating Anatomical-Textual Prompts for Multi-Organ and Tumor Segmentation

26 September 2024·2794 words·14 mins

AI Generated Computer Vision Image Segmentation 🏢 Qing Yuan Research Institute, Shanghai Jiao Tong University

CAT: A novel dual-prompt model coordinates anatomical and textual prompts for superior multi-organ & tumor segmentation in medical imaging, overcoming limitations of single-prompt methods.

Can We Leave Deepfake Data Behind in Training Deepfake Detector?

26 September 2024·2627 words·13 mins

Computer Vision Face Recognition 🏢 Tencent AI Lab

ProDet: Deepfake detection enhanced by progressively organizing blendfake and deepfake data in the latent space, improving generalization and robustness.

Can Simple Averaging Defeat Modern Watermarks?

26 September 2024·3146 words·15 mins

Computer Vision Image Generation 🏢 National University of Singapore

Simple averaging of watermarked images reveals hidden patterns, enabling watermark removal and forgery, thus highlighting the vulnerability of content-agnostic watermarking methods.

Bridging the Divide: Reconsidering Softmax and Linear Attention

26 September 2024·2335 words·11 mins

Computer Vision Image Classification 🏢 Tsinghua University

InLine attention, a novel method, bridges the performance gap between softmax and linear attention by incorporating injectivity and local modeling, achieving superior performance while maintaining lin…

Breaking the False Sense of Security in Backdoor Defense through Re-Activation Attack

26 September 2024·4173 words·20 mins

AI Generated Computer Vision Image Classification 🏢 School of Data Science, The Chinese University of Hong Kong

Researchers discover that existing backdoor defenses leave vulnerabilities, allowing for easy re-activation of backdoors through subtle trigger manipulation.

Breaking Semantic Artifacts for Generalized AI-generated Image Detection

26 September 2024·3098 words·15 mins

AI Generated Computer Vision Image Generation 🏢 School of Cyber Science and Engineering, Xi'an Jiaotong University

Researchers developed a new AI-generated image detection method that overcomes the limitation of existing detectors, achieving superior cross-scene generalization by shuffling image patches and traini…

BrainBits: How Much of the Brain are Generative Reconstruction Methods Using?

26 September 2024·2562 words·13 mins

Computer Vision Image Generation 🏢 MIT

BrainBits reveals that surprisingly little brain information is needed for high-fidelity image & text reconstruction, highlighting the dominance of generative model priors over neural signal extractio…

Boundary Matters: A Bi-Level Active Finetuning Method

26 September 2024·2351 words·12 mins

Computer Vision Active Learning 🏢 Dept. of CSE & School of AI & MoE Key Lab of AI, Shanghai Jiao Tong University

Bi-Level Active Finetuning Framework (BiLAF) revolutionizes sample selection for efficient model finetuning. Unlike existing methods, BiLAF incorporates both global diversity and local decision bounda…

Bootstrapping Top-down Information for Self-modulating Slot Attention

26 September 2024·2042 words·10 mins

Computer Vision Object Detection 🏢 POSTECH

This paper introduces a novel object-centric learning (OCL) framework that enhances slot attention with a self-modulating top-down pathway, significantly improving object representation and achieving …

Boosting the Transferability of Adversarial Attack on Vision Transformer with Adaptive Token Tuning

26 September 2024·2792 words·14 mins

Computer Vision Adversarial Attacks 🏢 Chongqing University of Technology

This paper introduces Adaptive Token Tuning (ATT), which boosts the transferability of adversarial attacks on vision transformers, improving attack success rate by 10.1% over existing methods.

BOLD: Boolean Logic Deep Learning

26 September 2024·3864 words·19 mins

Computer Vision Image Classification 🏢 Huawei Paris Research Center

Boolean Logic Deep Learning (BOLD) revolutionizes deep learning by enabling training with Boolean weights and activations, achieving state-of-the-art accuracy with drastically reduced energy consumpti…

Blind Image Restoration via Fast Diffusion Inversion

26 September 2024·1974 words·10 mins

Computer Vision Image Generation 🏢 Computer Vision Group, Institute of Informatics, University of Bern, Switzerland

BIRD: a novel blind image restoration method that jointly optimizes degradation model parameters and the restored image, ensuring realistic outputs via fast diffusion inversion and achieving state-of-the-a…

BLAST: Block-Level Adaptive Structured Matrices for Efficient Deep Neural Network Inference

26 September 2024·3033 words·15 mins

Computer Vision Image Generation 🏢 University of Michigan

The BLAST matrix learns efficient weight structures for faster deep learning inference, achieving significant compression and performance gains on various models.

BitsFusion: 1.99 bits Weight Quantization of Diffusion Model

26 September 2024·5994 words·29 mins

AI Generated Computer Vision Image Generation 🏢 Snap Inc.

BitsFusion achieves 7.9x smaller Stable Diffusion models by quantizing UNet weights to 1.99 bits, surprisingly improving image generation quality!

bit2bit: 1-bit quanta video reconstruction via self-supervised photon prediction

26 September 2024·3565 words·17 mins

Computer Vision Video Understanding 🏢 Case Western Reserve University

bit2bit reconstructs high-quality videos from sparse, binary quanta image sensor data using self-supervised photon location prediction, significantly improving resolution and usability.

Biologically-Inspired Learning Model for Instructed Vision

26 September 2024·2864 words·14 mins

Computer Vision Image Classification 🏢 Weizmann Institute of Science

Biologically-inspired AI model integrates learning & visual guidance via a novel ‘Counter-Hebb’ learning mechanism, achieving competitive performance on multi-task learning benchmarks.

Binocular-Guided 3D Gaussian Splatting with View Consistency for Sparse View Synthesis

26 September 2024·2801 words·14 mins

Computer Vision 3D Vision 🏢 Tsinghua University

Binocular-guided 3D Gaussian splatting with self-supervision generates high-quality novel views from sparse inputs without external priors, significantly outperforming state-of-the-art methods.

Binarized Diffusion Model for Image Super-Resolution

26 September 2024·1567 words·8 mins

Computer Vision Image Generation 🏢 ETH Zurich

BI-DiffSR, a novel binarized diffusion model, achieves high-quality image super-resolution with significantly reduced memory and computational costs, outperforming existing methods.

BiDM: Pushing the Limit of Quantization for Diffusion Models

26 September 2024·2390 words·12 mins

Computer Vision Image Generation 🏢 Beihang University

BiDM achieves full 1-bit quantization in diffusion models, significantly improving storage and speed without sacrificing image quality, setting a new state-of-the-art.

Bidirectional Recurrence for Cardiac Motion Tracking with Gaussian Process Latent Coding

26 September 2024·2178 words·11 mins

Computer Vision Image Segmentation 🏢 Hong Kong University of Science and Technology

GPTrack: A novel unsupervised framework enhances cardiac motion tracking by using sequential Gaussian processes and bidirectional recurrence, improving accuracy and efficiency.