🏢 Yonsei University
Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models
·1777 words·9 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Image Generation
🏢 Yonsei University
LSRNA: Super-resolution in latent space enhances higher-resolution image generation with diffusion models, achieving faster generation and improved detail.
AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM
·4656 words·22 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Video Understanding
🏢 Yonsei University
AnyAnomaly: LVLM for customizable zero-shot video anomaly detection, adapting to diverse environments without retraining.
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
·9576 words·45 mins·
AI Generated
🤗 Daily Papers
Natural Language Processing
Large Language Models
🏢 Yonsei University
The MCLM benchmark shows that test-time scaling, unlike pre-training scaling, is not a universal solution for multilingual math reasoning.
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding
·5447 words·26 mins·
AI Generated
🤗 Daily Papers
Computer Vision
Action Recognition
🏢 Yonsei University
DisCoRD: Rectified flow decodes discrete motion tokens into continuous, natural movement, balancing faithfulness and realism.
MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation
·4014 words·19 mins·
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
🏢 Yonsei University
MaskRIS enhances data augmentation for referring image segmentation through masking and contextual learning, achieving state-of-the-art results.