Markov games (MGs) and multi-agent reinforcement learning (MARL) are studied to model decision making in multi-agent systems. Traditionally, the objective in MGs and MARL has been risk-neutral: agents are assumed to optimize a performance metric such as expected return, without accounting for their own subjective or cognitive preferences or those of other agents. However, ignoring such preferences leads to inaccurate models of decision making in many real-world scenarios in finance, operations research, and behavioral economics. Therefore, when these preferences are present, a suitable measure of risk must be incorporated into the agents' optimization objective, which opens the door to risk-sensitive MGs and MARL. In this paper, we systematically review the literature on risk sensitivity in MGs and MARL, which has been growing in recent years alongside other areas of reinforcement learning and game theory. We define and mathematically describe the risk measures used in MGs and MARL and, for each measure in turn, discuss the articles that incorporate it. Finally, we identify recent trends in theoretical and applied work in the field and discuss possible directions for future research.