Nash equilibrium is a fundamental mathematical tool in economics and game theory. However, it classically assumes that player utilities are known, whereas economics generally regards preferences as more fundamental. To apply equilibrium analysis to a strategic scenario, one must first elicit numerical utilities consistent with player preferences, a delicate and time-consuming process. In this work, we forgo precise utilities and generalize the Nash equilibrium to a setting where we assume only that each player can provide an ordinal ranking of their own actions in the context of the other players' joint actions. The key technical challenge is to rethink the definition of a best response. While the classical definition identifies actions that maximize expected payoff, we look to social choice theory for ways to aggregate preferences and identify the most preferred actions. We define this generalized notion, the context-ordinal Nash equilibrium, establish its existence under mild conditions on the aggregation method, introduce notions of regularization, approximation, and regret, explore complexity in simple settings, and develop learning rules for computing such equilibria. In doing so, we provide a generalization of Nash equilibrium and demonstrate its direct applicability to preferences elicited in human experiments.
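To make the idea of an ordinal best response concrete, the following is a minimal sketch for a two-player game, not the paper's own construction. It assumes each player supplies, for every opponent action (the "context"), a ranking of their own actions, and it aggregates those rankings with a probability-weighted Borda count. The Borda rule, the function names (`borda_scores`, `best_responses`, `pure_equilibria`), and the toy coordination game are all illustrative assumptions; the abstract does not say which aggregation methods are used.

```python
from itertools import product


def borda_scores(rankings, context_probs):
    """Aggregate per-context rankings with a weighted Borda count.

    rankings[c] lists the player's own actions from most to least preferred
    when the opponent plays c. context_probs[c] is the probability of c.
    Assumes every ranking orders the same set of actions.
    """
    n = len(next(iter(rankings.values())))
    scores = {}
    for context, ranking in rankings.items():
        weight = context_probs.get(context, 0.0)
        for position, action in enumerate(ranking):
            # Top-ranked action earns n - 1 points, the last earns 0.
            scores[action] = scores.get(action, 0.0) + weight * (n - 1 - position)
    return scores


def best_responses(rankings, context_probs):
    """Return the set of actions with the highest aggregated score."""
    scores = borda_scores(rankings, context_probs)
    top = max(scores.values())
    return {a for a, s in scores.items() if abs(s - top) < 1e-12}


def pure_equilibria(rank_row, rank_col):
    """Enumerate pure profiles where each action is a best response to the other."""
    row_actions = sorted(next(iter(rank_row.values())))
    col_actions = sorted(next(iter(rank_col.values())))
    return [
        (r, c)
        for r, c in product(row_actions, col_actions)
        if r in best_responses(rank_row, {c: 1.0})
        and c in best_responses(rank_col, {r: 1.0})
    ]


# Hypothetical coordination game given only as rankings, no utilities.
rank_row = {"L": ["U", "D"], "R": ["D", "U"]}  # row's ranking per column action
rank_col = {"U": ["L", "R"], "D": ["R", "L"]}  # column's ranking per row action

print(pure_equilibria(rank_row, rank_col))
print(best_responses(rank_row, {"L": 0.75, "R": 0.25}))
```

With a weighted Borda count the best response coincides with maximizing expected payoff under Borda points as utilities, so this sketch shows only the simplest case; other aggregation rules from social choice theory would replace `borda_scores` and need not reduce to an expected-utility calculation.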