🏢 KAIST AI

Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions
·5687 words·27 mins
AI Generated 🤗 Daily Papers Multimodal Learning Vision-Language Models 🏢 KAIST AI
SIGHTATION: A BLV-aligned dataset utilizing sighted user feedback to enhance diagram descriptions generated by VLMs, improving accessibility for visually impaired learners.
Reangle-A-Video: 4D Video Generation as Video-to-Video Translation
·2533 words·12 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 KAIST AI
Reangle-A-Video generates synchronized multi-view videos from a single video via video-to-video translation, surpassing existing methods without specialized 4D training.