
🏢 Yonsei University

Latent Space Super-Resolution for Higher-Resolution Image Generation with Diffusion Models
·1777 words·9 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 Yonsei University
LSRNA: Super-resolution in latent space enhances image generation with diffusion models, achieving faster generation and improved detail.
AnyAnomaly: Zero-Shot Customizable Video Anomaly Detection with LVLM
·4656 words·22 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 Yonsei University
AnyAnomaly: LVLM for customizable zero-shot video anomaly detection, adapting to diverse environments without retraining.
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning
·9576 words·45 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 Yonsei University
The MCLM benchmark shows that test-time scaling, unlike pre-training scaling, is not a universal fix for multilingual math reasoning.
DisCoRD: Discrete Tokens to Continuous Motion via Rectified Flow Decoding
·5447 words·26 mins
AI Generated 🤗 Daily Papers Computer Vision Action Recognition 🏢 Yonsei University
DisCoRD: Rectified flow decodes discrete motion tokens into continuous, natural movement, balancing faithfulness and realism.
MaskRIS: Semantic Distortion-aware Data Augmentation for Referring Image Segmentation
·4014 words·19 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 Yonsei University
MaskRIS enhances data augmentation for referring image segmentation through masking and contextual learning, achieving state-of-the-art results.