Speech and Audio
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution
1883 words · 9 mins
AI Generated
🤗 Daily Papers
Speech and Audio
Audio Generation
🏢 Alibaba Group
HiFi-SR: A unified generative network achieves high-fidelity speech super-resolution, outperforming existing methods by seamlessly integrating transformer and convolutional components for end-to-end a…
XMusic: Towards a Generalized and Controllable Symbolic Music Generation Framework
3087 words · 15 mins
AI Generated
🤗 Daily Papers
Speech and Audio
Music Generation
🏢 Tencent AI Lab
XMusic: A new framework generates high-quality, emotionally controllable symbolic music from various prompts (images, videos, text, tags, humming).
Whisper-GPT: A Hybrid Representation Audio Large Language Model
1640 words · 8 mins
AI Generated
🤗 Daily Papers
Speech and Audio
Audio Generation
🏢 Stanford University
Whisper-GPT, a hybrid-representation audio LLM, improves music and speech generation by combining continuous audio representations with discrete tokens in a single architecture.