
🏢 KAIST

ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation
·2259 words·11 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 KAIST
ORIGEN: First zero-shot 3D orientation grounding in text-to-image generation.
Unconditional Priors Matter! Improving Conditional Generation of Fine-Tuned Diffusion Models
·393 words·2 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 KAIST
Fixing fine-tuned diffusion models! By using richer, unconditional priors, they generate better images and videos.
Inference-Time Scaling for Flow Models via Stochastic Generation and Rollover Budget Forcing
·2020 words·10 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 KAIST
Inference-time scaling for flow models enhances alignment with user preferences via stochastic generation and budget allocation.
Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models
·410 words·2 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 KAIST
New ‘Silent Branding Attack’ poisons text-to-image models, embedding brand logos without text prompts, raising ethical issues for image generation tools.
Tuning-Free Multi-Event Long Video Generation via Synchronized Coupled Sampling
·3192 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 KAIST
SynCoS: Synchronized sampling generates high-quality & coherent long videos from text, without extra training!
Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
·1708 words·9 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 KAIST
Sketch-of-Thought (SoT) reduces LLM token usage by up to 76% while maintaining (or improving) accuracy via cognitive-inspired sketching.
Is Safety Standard Same for Everyone? User-Specific Safety Evaluation of Large Language Models
·5119 words·25 mins
AI Generated 🤗 Daily Papers AI Theory Safety 🏢 KAIST
LLMs fail to act safely under user-specific safety standards, a gap this work addresses with a new benchmark.
SafeRoute: Adaptive Model Selection for Efficient and Accurate Safety Guardrails in Large Language Models
·2481 words·12 mins
AI Generated 🤗 Daily Papers Natural Language Processing Large Language Models 🏢 KAIST
SafeRoute efficiently enhances LLM safety by adaptively using smaller and larger safety guard models, maximizing accuracy while minimizing costs.
MIVE: New Design and Benchmark for Multi-Instance Video Editing
·7714 words·37 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 KAIST
Edit many objects at once in videos! MIVE does it accurately without affecting other areas, a big step for AI video editing.
Controllable Human Image Generation with Personalized Multi-Garments
·4062 words·20 mins
AI Generated 🤗 Daily Papers Computer Vision Image Generation 🏢 KAIST
BootComp: generate realistic human images wearing multiple garments using a novel synthetic data pipeline & diffusion model, enabling diverse applications like virtual try-on.
Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction
·2991 words·15 mins
AI Generated 🤗 Daily Papers Computer Vision Video Understanding 🏢 KAIST
CoordTok: a novel video tokenizer that drastically reduces token count for long videos, enabling memory-efficient training of diffusion models for high-quality, long video generation.