
🏢 Institute of Information Engineering, Chinese Academy of Sciences

TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
·2216 words·11 mins
Image Generation 🏢 Institute of Information Engineering, Chinese Academy of Sciences
TextCtrl is a diffusion-based scene text editing method that uses prior guidance control to achieve superior style fidelity and accuracy. The work also introduces ScenePair, a new real-world benchmark dataset.
SpeechForensics: Audio-Visual Speech Representation Learning for Face Forgery Detection
·2151 words·11 mins
Computer Vision Face Recognition 🏢 Institute of Information Engineering, Chinese Academy of Sciences
SpeechForensics leverages audio-visual speech representation learning to achieve superior face forgery detection, outperforming state-of-the-art methods in cross-dataset generalization and robustness.
Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features
·3306 words·16 mins
Image Generation 🏢 Institute of Information Engineering, Chinese Academy of Sciences
This research reveals key properties of diffusion model activations that enable effective feature selection, unlocking discriminative features that surpass state-of-the-art methods.