🏢 Alibaba Group
EMO2: End-Effector Guided Audio-Driven Avatar Video Generation
·2205 words·11 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Alibaba Group
EMO2 achieves realistic audio-driven avatar video generation by employing a two-stage framework: first generating hand poses directly from audio and then using a diffusion model to synthesize full-bod…
Textoon: Generating Vivid 2D Cartoon Characters from Text Descriptions
·2057 words·10 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Alibaba Group
Textoon: Generating vivid 2D cartoon characters from text descriptions in under a minute, revolutionizing animation workflow.
HiFi-SR: A Unified Generative Transformer-Convolutional Adversarial Network for High-Fidelity Speech Super-Resolution
·1883 words·9 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Speech and Audio
Audio Generation
🏢 Alibaba Group
HiFi-SR: A unified generative network achieves high-fidelity speech super-resolution, outperforming existing methods by seamlessly integrating transformer and convolutional components for end-to-end a…
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings
·2397 words·12 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Alibaba Group
CODEELO benchmark uses CodeForces to fairly evaluate LLMs’ coding abilities, providing human-comparable Elo ratings and addressing limitations of existing benchmarks.
Evaluating and Aligning CodeLLMs on Human Preference
·3535 words·17 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Alibaba Group
CodeArena, a novel benchmark, evaluates code LLMs based on human preferences, revealing performance gaps between open-source and proprietary models, and a large-scale synthetic instruction corpus impr…
A Simple and Provable Scaling Law for the Test-Time Compute of Large Language Models
·1730 words·9 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Alibaba Group
Boost LLM accuracy exponentially by using a two-stage algorithm with provable scaling laws: generate multiple candidate solutions then compare them in a knockout tournament!
Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model
·271 words·2 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Alibaba Group
TeaCache: a training-free method boosts video diffusion model speed by up to 4.41x with minimal quality loss by cleverly caching intermediate outputs.
M2rc-Eval: Massively Multilingual Repository-level Code Completion Evaluation
·4787 words·23 mins·
loading
·
loading
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Alibaba Group
M2RC-EVAL: A new massively multilingual benchmark for repository-level code completion, featuring fine-grained annotations and a large instruction dataset, enabling better evaluation of code LLMs acro…