Speech Recognition

SILENCE: Protecting privacy in offloaded speech understanding on resource-constrained devices

26 September 2024·2275 words·11 mins

Natural Language Processing Speech Recognition 🏢 Peking University

SILENCE, a novel lightweight system, protects user privacy in offloaded speech understanding on resource-constrained devices by selectively masking short-term audio details without impacting long-term…

Separate and Reconstruct: Asymmetric Encoder-Decoder for Speech Separation

26 September 2024·3807 words·18 mins

AI Generated Speech and Audio Speech Recognition 🏢 Sogang University

SepReformer: Asymmetric encoder-decoder model for efficient speech separation, achieving state-of-the-art performance with less computation.

Self-Taught Recognizer: Toward Unsupervised Adaptation for Speech Foundation Models

26 September 2024·2366 words·12 mins

Natural Language Processing Speech Recognition 🏢 NVIDIA Research

STAR, a novel unsupervised adaptation framework, drastically improves automatic speech recognition (ASR) robustness across diverse domains using only unlabeled data and outperforms existing self-train…

REBORN: Reinforcement-Learned Boundary Segmentation with Iterative Training for Unsupervised ASR

26 September 2024·2781 words·14 mins

AI Generated Natural Language Processing Speech Recognition 🏢 National Taiwan University

REBORN: An iterative training framework significantly improves unsupervised ASR by learning optimal speech segment boundaries using reinforcement learning, outperforming existing methods.

CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing

26 September 2024·2545 words·12 mins

Speech and Audio Speech Recognition 🏢 Johns Hopkins University

CA-SSLR, a novel self-supervised learning model, dynamically adapts to various speech tasks by integrating language and speaker embeddings, improving performance and reducing reliance on audio features…

Aligner-Encoders: Self-Attention Transformers Can Be Self-Transducers

26 September 2024·2029 words·10 mins

Speech Recognition 🏢 Google

Self-attention Transformers can perform alignment themselves, enabling simpler, faster speech recognition models.