🏢 Drexel University
GTBench: Uncovering the Strategic Reasoning Capabilities of LLMs via Game-Theoretic Evaluations
2898 words · 14 mins
Natural Language Processing
Large Language Models
GTBench evaluates the strategic reasoning of LLMs through game-theoretic tasks and exposes their weaknesses: models hold up in probabilistic scenarios but struggle in deterministic ones, and code pretraining helps.