🏢 Institute of Information Engineering, Chinese Academy of Sciences
TextCtrl: Diffusion-based Scene Text Editing with Prior Guidance Control
·2216 words·11 mins·
Image Generation
🏢 Institute of Information Engineering, Chinese Academy of Sciences
TextCtrl is a diffusion-based scene text editing method that uses prior guidance control to achieve superior style fidelity and text accuracy; the work also introduces ScenePair, a new real-world benchmark dataset.
SpeechForensics: Audio-Visual Speech Representation Learning for Face Forgery Detection
·2151 words·11 mins·
Computer Vision
Face Recognition
🏢 Institute of Information Engineering, Chinese Academy of Sciences
SpeechForensics leverages audio-visual speech representation learning to achieve superior face forgery detection, outperforming state-of-the-art methods in cross-dataset generalization and robustness.
Not All Diffusion Model Activations Have Been Evaluated as Discriminative Features
·3306 words·16 mins·
Image Generation
🏢 Institute of Information Engineering, Chinese Academy of Sciences
This research identifies key properties of diffusion model activations that enable effective feature selection, yielding discriminative features that surpass state-of-the-art methods.