🏢 M-a-P
CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models
2433 words · 12 mins
AI Generated
🤗 Daily Papers
AI Theory
Robustness
CodeCriticBench: A new benchmark for holistic code critique by Large Language Models.