🏢 Microsoft

TransVIP: Speech to Speech Translation System with Voice and Isochrony Preservation
·1866 words·9 mins
Natural Language Processing Machine Translation 🏢 Microsoft
TransVIP: a speech-to-speech translation system that preserves the speaker's voice and isochrony, outperforming current state-of-the-art models.
Motion Graph Unleashed: A Novel Approach to Video Prediction
·2948 words·14 mins
Computer Vision Video Understanding 🏢 Microsoft
Motion Graph unleashes efficient and accurate video prediction by transforming video frames into interconnected graph nodes, capturing complex motion patterns with minimal computational cost.
Motion Consistency Model: Accelerating Video Diffusion with Disentangled Motion-Appearance Distillation
·2797 words·14 mins
AI Generated Computer Vision Video Understanding 🏢 Microsoft
Boosting video diffusion: Motion Consistency Model (MCM) disentangles motion and appearance learning for high-fidelity, fast video generation using few sampling steps.
Make Your LLM Fully Utilize the Context
·2445 words·12 mins
Natural Language Processing Large Language Models 🏢 Microsoft
FILM-7B, trained with Information-Intensive (IN2) training, significantly overcomes the ‘lost-in-the-middle’ problem in long-context LLMs, enabling robust information retrieval from all context positi…