
🏢 UC Santa Barbara

RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios
·3495 words·17 mins
AI Generated · 🤗 Daily Papers · Natural Language Processing · Large Language Models · 🏢 UC Santa Barbara
RULEARENA, a new benchmark, rigorously evaluates large language models’ ability to apply complex, real-world rules across diverse scenarios, revealing significant shortcomings in current LLMs’ rule-gu…
Game-theoretic LLM: Agent Workflow for Negotiation Games
·4966 words·24 mins
AI Generated · 🤗 Daily Papers · AI Theory · Optimization · 🏢 UC Santa Barbara
Game-theoretic LLM: Agent Workflow for Negotiation Games improves the rationality of large language models (LLMs) in strategic decision-making through novel game-theoretic workflows.