🏢 UNC Chapel Hill

SELMA: Learning and Merging Skill-Specific Text-to-Image Experts with Auto-Generated Data

26 September 2024·2539 words·12 mins

Multimodal Learning Vision-Language Models 🏢 UNC Chapel Hill

SELMA boosts text-to-image fidelity by merging skill-specific models trained on automatically generated image-text datasets.

LACIE: Listener-Aware Finetuning for Calibration in Large Language Models

26 September 2024·2396 words·12 mins

Natural Language Processing Large Language Models 🏢 UNC Chapel Hill

LACIE, a listener-aware finetuning method, improves LLM confidence calibration, reducing the number of incorrect answers accepted by human listeners by 47% while maintaining acceptance of correct answers.

Calibrated Self-Rewarding Vision Language Models

26 September 2024·2260 words·11 mins

Multimodal Learning Vision-Language Models 🏢 UNC Chapel Hill

Calibrated Self-Rewarding (CSR) significantly improves vision-language models by using a novel iterative approach that incorporates visual constraints into the self-rewarding process, reducing halluci…