For agents in multi-agent systems (MAS) to be safe, they need to take into account the risks posed by the actions of other agents. For example, in hybrid human-AI driving systems, it is necessary to limit large deviations in reward resulting from car crashes. However, the dominant paradigm in game theory (GT) assumes that agents are not affected by risk from other agents and strive only to maximise their expected utility. Although some equilibrium concepts in game theory take risk aversion into account, they either assume that agents are risk-neutral with respect to the uncertainty caused by the actions of other agents, or they are not guaranteed to exist. We introduce a new GT-based Risk-Averse Equilibrium (RAE) that always produces a solution that minimises the potential variance in reward, accounting for the strategies of other agents. Theoretically and empirically, we show that RAE shares many properties with a Nash Equilibrium (NE), establishing convergence properties and generalising to risk-dominant NE in certain cases. To tackle large-scale problems, we extend RAE to the Policy-Space Response Oracles (PSRO) multi-agent reinforcement learning (MARL) framework. We empirically demonstrate the minimum-reward-variance benefits of RAE in matrix games with high-risk outcomes. Results on MARL experiments show that RAE generalises to risk-dominant NE in a trust dilemma game and that it reduces instances of crashing by 7x in an autonomous driving setting versus the best-performing baseline.
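To make the notions of reward variance and risk dominance concrete, the following is a minimal sketch on a small trust-dilemma (stag hunt) matrix game. It is not the RAE computation, which the abstract does not define. The payoff matrix `U` and the helper names `reward_mean_and_variance` and `risk_dominant_action` are hypothetical, chosen only for illustration. The risk-dominance check uses the standard Harsanyi-Selten comparison of deviation losses and assumes a symmetric 2x2 game with two strict pure equilibria.

```python
import numpy as np

# Hypothetical symmetric stag hunt payoffs for the row player (illustrative only).
# Rows: own action (0 = Stag, 1 = Hare); columns: opponent action.
U = np.array([[4.0, 0.0],
              [3.0, 3.0]])


def reward_mean_and_variance(own, opp, payoffs):
    """Mean and variance of the row player's reward when both players
    sample actions independently from mixed strategies `own` and `opp`."""
    joint = np.outer(own, opp)  # joint action probabilities
    mean = float(np.sum(joint * payoffs))
    var = float(np.sum(joint * (payoffs - mean) ** 2))
    return mean, var


def risk_dominant_action(payoffs):
    """Risk-dominant action in a symmetric 2x2 game with two strict pure
    equilibria (0, 0) and (1, 1). In the symmetric case, comparing the
    products of deviation losses reduces to comparing the losses."""
    loss_0 = payoffs[0, 0] - payoffs[1, 0]  # cost of deviating from (0, 0)
    loss_1 = payoffs[1, 1] - payoffs[0, 1]  # cost of deviating from (1, 1)
    return 0 if loss_0 > loss_1 else 1


if __name__ == "__main__":
    uniform = np.array([0.5, 0.5])  # opponent equally likely to play either action
    stag = np.array([1.0, 0.0])
    hare = np.array([0.0, 1.0])
    print("Stag vs uniform:", reward_mean_and_variance(stag, uniform, U))
    print("Hare vs uniform:", reward_mean_and_variance(hare, uniform, U))
    print("Risk-dominant action index:", risk_dominant_action(U))
```

In this example, Stag yields the higher payoff when both players coordinate on it, but its reward varies with the opponent's action. Hare yields the same reward whatever the opponent does, so its variance is zero, and (Hare, Hare) is the risk-dominant equilibrium.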