🏢 M-a-P

CodeCriticBench: A Holistic Code Critique Benchmark for Large Language Models
2433 words · 12 mins
AI Generated 🤗 Daily Papers AI Theory Robustness 🏢 M-a-P
CodeCriticBench: A new benchmark for holistic code critique by Large Language Models.