The emergence of large language models (LLMs) has spurred economists to study how humans and LLMs behave in strategic settings. We organized a series of round-robin tournaments in the Colonel Blotto game, which attracts game theorists' attention because of its high-dimensional action space and its lack of pure-strategy Nash equilibria. In the first tournament, more than 200 human participants competed against one another. In the second tournament, several popular LLMs were invited to submit strategies. In the third tournament, we matched the number of LLM strategies to the number submitted by humans. We find that humans more often employ better-calibrated, intermediate-level allocation heuristics and that they outperform the simpler, more stereotyped strategies submitted by LLMs. Strategic sophistication is key to success only when the necessary depth of reasoning is reached; lower and higher levels of reasoning offer no clear advantage over primitive strategies. Among humans, field of study weakly predicts success: participants with STEM backgrounds perform better in the first tournament. Surprisingly, humans barely adjust their strategies across tournaments with different sets of opponents. This result suggests that humans base their choices primarily on the game's rules rather than on the identity of their opponents, treating LLMs much like human competitors.