🏢 Nanyang Technological University
RepVideo: Rethinking Cross-Layer Representation for Video Generation
·2785 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Nanyang Technological University
RepVideo enhances text-to-video generation by enriching feature representations, resulting in significantly improved temporal coherence and spatial detail.
SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration
·1895 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Nanyang Technological University
SeedVR: a diffusion transformer for generic video restoration that efficiently handles arbitrary video lengths and resolutions and achieves state-of-the-art performance.
AntiLeak-Bench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge
·2611 words·13 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Nanyang Technological University
AntiLeak-Bench automatically constructs benchmarks from updated real-world knowledge, keeping LLM evaluation free of data contamination.
FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion
·2401 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Nanyang Technological University
FreeScale generates 8K images and high-fidelity videos without retraining.
Arbitrary-steps Image Super-resolution via Diffusion Inversion
·3889 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Nanyang Technological University
InvSR: a novel image super-resolution technique using diffusion inversion, enabling flexible sampling steps for efficient and high-fidelity results.
ObjCtrl-2.5D: Training-free Object Control with Camera Poses
·3506 words·17 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Nanyang Technological University
ObjCtrl-2.5D: Training-free, precise image-to-video object control using 3D trajectories and camera poses.
Trajectory Attention for Fine-grained Video Motion Control
·4421 words·21 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Nanyang Technological University
Trajectory Attention enhances video motion control by injecting trajectory information, improving precision and long-range consistency in video generation.
Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis
·3637 words·18 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Nanyang Technological University
Omegance: One parameter precisely controls image detail in diffusion models, enabling flexible granularity adjustments without model changes or retraining.
SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE
·2778 words·14 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Nanyang Technological University
SAR3D: fast autoregressive 3D object generation and understanding using a multi-scale VQVAE, achieving sub-second generation and detailed multimodal comprehension.
Novel View Extrapolation with Video Diffusion Priors
·2381 words·12 mins·
AI Generated
🤗 Daily Papers
Computer Vision
3D Vision
🏢 Nanyang Technological University
ViewExtrapolator leverages Stable Video Diffusion to realistically extrapolate novel views far beyond training data, dramatically improving the quality of 3D scene generation.
VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
·3966 words·19 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Nanyang Technological University
VBench++: A new benchmark suite meticulously evaluates video generative models across 16 diverse dimensions, aligning with human perception for improved model development and fairer comparisons.