Game-theoretic interactions with AI agents could differ from traditional human-human interactions in various ways. One such difference is that it may be possible to simulate an AI agent (for example, because its source code is known), which allows others to accurately predict the agent's actions. This could lower the bar for trust and cooperation. In this paper, we formalize games in which one player can simulate another at a cost. We first derive some basic properties of such games and then prove a number of results for them, including: (1) introducing simulation into generic-payoff normal-form games makes them easier to solve; (2) if the only obstacle to cooperation is a lack of trust in the possibly-simulated agent, simulation enables equilibria that improve the outcome for both agents; however, (3) there are settings where introducing simulation results in strictly worse outcomes for both players.
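To make claim (2) concrete, the following sketch works through a Trust Game with hypothetical payoffs (the numbers and the cost parameter `c` are illustrative assumptions, not taken from the paper): without simulation, the trustee's incentive to defect leads the truster to walk away, while paying a simulation cost `c` to verify the trustee's strategy lets both players do strictly better whenever `c` is small enough.

```python
# Illustrative Trust Game (payoffs are assumed for this sketch).
# P1 chooses Trust or Walk; if trusted, P2 chooses Cooperate or Defect.
PAYOFFS = {
    ("Walk", None): (1, 1),          # no interaction
    ("Trust", "Cooperate"): (2, 2),  # mutual gain
    ("Trust", "Defect"): (0, 3),     # P2 exploits P1's trust
}

def outcome_without_simulation():
    # Backward induction: if trusted, P2 defects (3 > 2),
    # so P1 prefers to walk (1 > 0). Equilibrium outcome: (1, 1).
    return PAYOFFS[("Walk", None)]

def outcome_with_simulation(c):
    # P1 pays c to simulate P2's strategy and trusts only a cooperator.
    # Anticipating the simulation, P2 commits to Cooperate (2 > 1),
    # so the realized payoffs are (2 - c, 2).
    u1, u2 = PAYOFFS[("Trust", "Cooperate")]
    return (u1 - c, u2)
```

For any simulation cost `c < 1`, both players' payoffs under simulation exceed the no-simulation outcome of `(1, 1)`, matching the flavor of claim (2); if `c > 1`, the simulating player would be worse off, hinting at why results like (3) can arise.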