Logit quantal response equilibrium (logit QRE) offers a natural equilibrium selection mechanism and converges to a Nash equilibrium as the rationality parameter tends to infinity. In extensive-form games, however, computing it from the normal-form representation is generally intractable, because the number of pure strategies grows exponentially with the size of the game tree. To address this difficulty, this paper develops a sequence-form formulation of logit QRE for finite n-player extensive-form games with perfect recall, which avoids explicit construction of the normal form and enables compact computation. Based on this formulation, we further develop a differentiable path-following method that starts from an arbitrary initial point: each point on the path corresponds to a logit QRE for a particular value of the rationality parameter, and the limiting point yields a Nash equilibrium. The proposed method thus provides an efficient computational framework for exploiting the equilibrium selection property of logit QRE in extensive-form games. Its effectiveness is supported by theoretical analysis and numerical experiments.
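As a minimal illustration of the role played by the rationality parameter, the sketch below computes the standard logit (softmax) response of a single player to a fixed vector of expected payoffs. It is not the sequence-form formulation or the path-following method of the paper; the function name `logit_response`, the parameter name `lam`, and the payoff values are assumptions chosen only for this example.

```python
import math


def logit_response(expected_payoffs, lam):
    """Return logit choice probabilities for one player.

    expected_payoffs: expected payoff of each action against fixed opponents
                      (hypothetical input for illustration).
    lam: rationality parameter; 0 gives uniform play, and large values
         concentrate probability on the best response.
    """
    scaled = [lam * u for u in expected_payoffs]
    shift = max(scaled)  # subtract the maximum for numerical stability
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    payoffs = [1.0, 2.0, 3.0]  # illustrative values, not from the paper
    for lam in (0.0, 1.0, 10.0, 100.0):
        probs = logit_response(payoffs, lam)
        print(lam, [round(p, 4) for p in probs])
```

A logit QRE is a strategy profile in which every player's strategy is such a response to the others. The sketch shows only the response map for one player: at `lam = 0` the choice is uniform, and as `lam` grows the probability mass moves toward the payoff-maximizing action.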