🏢 Imperial College

Unified Speech Recognition: A Single Model for Auditory, Visual, and Audiovisual Inputs
2851 words · 14 mins
Multimodal Learning Audio-Visual Learning 🏢 Imperial College
One model to rule them all! This paper introduces Unified Speech Recognition (USR), a single model trained for auditory, visual, and audiovisual speech recognition, achieving state-of-the-art results …