For agents in multi-agent systems (MAS) to be safe, they must account for the risks posed by the actions of other agents. However, the dominant paradigm in game theory (GT) assumes that agents are unaffected by risk from other agents and strive only to maximise their expected utility. For example, hybrid human-AI driving systems must limit large deviations in reward resulting from car crashes. Although game theory offers equilibrium concepts that account for risk aversion, they either assume that agents are risk-neutral with respect to the uncertainty caused by the actions of other agents, or they are not guaranteed to exist. We introduce a new GT-based Risk-Averse Equilibrium (RAE) that always produces a solution which minimises the potential variance in reward while accounting for the strategies of other agents. Theoretically and empirically, we show that RAE shares many properties with a Nash Equilibrium (NE), establishing convergence properties and generalising to risk-dominant NE in certain cases. To tackle large-scale problems, we extend RAE to the Policy-Space Response Oracles (PSRO) multi-agent reinforcement learning (MARL) framework. We empirically demonstrate the benefits of RAE's minimum reward variance in matrix games with high-risk outcomes. MARL experiments show that RAE generalises to risk-dominant NE in a trust dilemma game and reduces crashes by 7x in an autonomous driving setting relative to the best-performing baseline.
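To make the trade-off between expected utility and reward variance concrete, the following minimal sketch evaluates a row player's actions in a trust dilemma (stag hunt) matrix game against a fixed opponent mixed strategy. It is not the RAE definition from the paper: the payoff numbers, the mean-minus-weighted-variance score, and all names (`STAG_HUNT_ROW_PAYOFFS`, `payoff_mean_and_variance`, `mean_variance_score`, `best_action`, `risk_weight`) are assumptions chosen purely for illustration.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical row-player payoffs for a stag hunt (trust dilemma).
# Each list gives the payoff against the opponent playing [stag, hare].
STAG_HUNT_ROW_PAYOFFS: Dict[str, List[float]] = {
    "stag": [4.0, 0.0],  # high reward if the opponent cooperates, nothing otherwise
    "hare": [3.0, 3.0],  # safe reward regardless of the opponent
}


def payoff_mean_and_variance(
    payoffs: Sequence[float], opponent_strategy: Sequence[float]
) -> Tuple[float, float]:
    """Mean and variance of one action's payoff under the opponent's mixed strategy."""
    mean = sum(p * u for p, u in zip(opponent_strategy, payoffs))
    variance = sum(p * (u - mean) ** 2 for p, u in zip(opponent_strategy, payoffs))
    return mean, variance


def mean_variance_score(
    action: str, opponent_strategy: Sequence[float], risk_weight: float
) -> float:
    """Illustrative risk-sensitive score: expected payoff minus weighted variance."""
    mean, variance = payoff_mean_and_variance(
        STAG_HUNT_ROW_PAYOFFS[action], opponent_strategy
    )
    return mean - risk_weight * variance


def best_action(opponent_strategy: Sequence[float], risk_weight: float) -> str:
    """Action with the highest illustrative mean-variance score."""
    return max(
        STAG_HUNT_ROW_PAYOFFS,
        key=lambda a: mean_variance_score(a, opponent_strategy, risk_weight),
    )


if __name__ == "__main__":
    opponent = [0.8, 0.2]  # opponent plays stag with probability 0.8
    for weight in (0.0, 0.5):
        print(weight, best_action(opponent, weight))
```

With `risk_weight=0.0` the score reduces to expected utility and the player prefers the high-payoff but uncertain action. A positive weight penalises the variance induced by the opponent's strategy and, in this toy game, shifts the choice to the safe action, which corresponds to the risk-dominant outcome.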