This paper presents a new contribution to the growing set of benchmarks used to prune potential AI designs. Much as one might evaluate a machine by its performance at chess, this benchmark tests a machine's performance at a game called "Musical Chairs." At the time of writing, Claude, ChatGPT, and Qwen each failed this test, so the test could aid in their ongoing improvement. Furthermore, this paper sets the stage for future innovation in game theory and AI safety by demonstrating success with a non-standard approach to each: studying a game beyond the scope of previous game-theoretic tools, and mitigating a serious AI safety risk in a way that requires neither the determination of values nor their enforcement.