🏢 National Key Laboratory for Novel Software Technology, Nanjing University
CapArena: Benchmarking and Analyzing Detailed Image Captioning in the LLM Era
4997 words · 24 mins
AI Generated
🤗 Daily Papers
Multimodal Learning
Vision-Language Models
CapArena: a benchmark for detailed image captioning in the LLM era that reveals biases in existing metrics and advances automated evaluation.