🏢 Microsoft
TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation
·1866 words·9 mins·
Natural Language Processing
Machine Translation
🏢 Microsoft
TransVIP: a speech-to-speech translation system that preserves the speaker's voice and isochrony, outperforming current state-of-the-art models.
Motion Graph Unleashed: A Novel Approach to Video Prediction
·2948 words·14 mins·
Computer Vision
Video Understanding
🏢 Microsoft
Motion Graph enables efficient and accurate video prediction by transforming video frames into interconnected graph nodes, capturing complex motion patterns with minimal computational cost.
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
·2797 words·14 mins·
AI Generated
Computer Vision
Video Understanding
🏢 Microsoft
Boosting video diffusion: Motion Consistency Model (MCM) disentangles motion and appearance learning for high-fidelity, fast video generation in only a few sampling steps.
Make Your LLM Fully Utilize the Context
·2445 words·12 mins·
Natural Language Processing
Large Language Models
🏢 Microsoft
FILM-7B, trained with Information-Intensive (IN2) training, significantly overcomes the "lost-in-the-middle" problem in long-context LLMs, enabling robust information retrieval from all context positi…