Inverse Game Theory (IGT) methods based on the entropy-regularized Quantal Response Equilibrium (QRE) offer a tractable approach for competitive settings, but they critically assume that the agents' rationality parameter (temperature $τ$) is known a priori. When $τ$ is unknown, a fundamental scale ambiguity emerges: $τ$ is coupled with the reward parameters $θ$, so the two are not statistically identifiable. We introduce Blind-IGT, the first statistical framework to jointly recover both $θ$ and $τ$ from observed behavior. We analyze this bilinear inverse problem and, by introducing a normalization constraint that resolves the scale ambiguity, establish necessary and sufficient conditions for unique identification. We propose an efficient Normalized Least Squares (NLS) estimator and prove that it achieves the optimal $\mathcal{O}(N^{-1/2})$ convergence rate for joint parameter recovery. When the strong identifiability conditions fail, we provide partial identification guarantees through confidence set construction. We extend our framework to Markov games, establishing optimal convergence rates and demonstrating strong empirical performance even when the transition dynamics are unknown.
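To make the scale ambiguity concrete, here is a minimal Python sketch; it is an illustration under stated assumptions, not the paper's implementation. It assumes a two-player zero-sum matrix game whose payoff matrix is linear in $θ$, and it uses the standard logit (softmax) response map that defines an entropy-regularized QRE. The names `features`, `payoff`, `qre_response`, and `normalize` are hypothetical. The unit-norm constraint on $θ$ is only one illustrative choice of normalization, since the abstract does not state which constraint Blind-IGT uses.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D vector.
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()

def payoff(theta, features):
    # Assumed linear reward model: Q_theta = sum_k theta[k] * features[k].
    return np.tensordot(theta, features, axes=1)

def qre_response(theta, tau, features, mu, nu):
    # Logit response map; a QRE is a fixed point (mu, nu) of this map.
    Q = payoff(theta, features)
    mu_new = softmax(Q @ nu / tau)      # row player maximizes Q
    nu_new = softmax(-Q.T @ mu / tau)   # column player minimizes Q
    return mu_new, nu_new

def normalize(theta, tau):
    # Illustrative normalization (an assumption): fix ||theta|| = 1
    # and rescale tau by the same factor.
    s = np.linalg.norm(theta)
    return theta / s, tau / s

rng = np.random.default_rng(0)
features = rng.normal(size=(3, 4, 5))   # 3 reward features, 4 x 5 game
theta = np.array([1.0, -0.5, 2.0])
tau = 0.7
mu = np.full(4, 0.25)                   # arbitrary test policies
nu = np.full(5, 0.2)

c = 3.0
a = qre_response(theta, tau, features, mu, nu)
b = qre_response(c * theta, c * tau, features, mu, nu)  # joint rescaling
d = qre_response(theta, c * tau, features, mu, nu)      # tau alone changed

# Joint rescaling leaves the response map, and hence the QRE, unchanged.
scale_invariant = np.allclose(a[0], b[0]) and np.allclose(a[1], b[1])

# Both scaled copies map to the same normalized representative.
n1 = normalize(theta, tau)
n2 = normalize(c * theta, c * tau)
```

Only the ratio $θ/τ$ enters the response map, so $(θ, τ)$ and $(cθ, cτ)$ generate identical behavior for any $c > 0$. A constraint of this kind selects one representative from each such family, which is what allows $θ$ and $τ$ to be separated.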