We study inverse mechanism learning: recovering an unknown incentive-generating mechanism from observed strategic interaction traces of self-interested learning agents. Unlike inverse game theory and multi-agent inverse reinforcement learning, which typically infer utility or reward parameters inside a structured mechanism, our target includes unstructured mechanisms -- a (possibly neural) mapping from joint actions to per-agent payoffs. Unlike differentiable mechanism design, which optimizes mechanisms in the forward direction, we infer mechanisms from behavior in an observational setting. We propose DIML, a likelihood-based framework that differentiates through a model of multi-agent learning dynamics and uses the candidate mechanism to generate the counterfactual payoffs needed to predict observed actions. We establish identifiability of payoff differences under a conditional logit response model and prove statistical consistency of maximum likelihood estimation under standard regularity conditions. We evaluate DIML on simulated interactions of learning agents across unstructured neural mechanisms, congestion tolling, public goods subsidies, and large-scale anonymous games. DIML reliably recovers identifiable incentive differences and supports counterfactual prediction: its performance rivals a tabular enumeration oracle in small environments, and its convergence scales to large environments with hundreds of participants. Code to reproduce our experiments is open-sourced.
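The conditional logit response model mentioned above, in which the candidate mechanism supplies the counterfactual payoffs used to score each observed action, can be sketched as follows. This is a minimal NumPy illustration, not the paper's DIML implementation: the tabular payoff parameter `theta` (a stand-in for a neural mechanism), the two-agent binary-action setup, and the temperature `beta` are all illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D payoff vector."""
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

def nll(theta, observations, beta=1.0):
    """Negative log-likelihood of observed joint actions under a
    conditional logit response: each agent chooses action a with
    probability proportional to exp(beta * payoff(a, a_opponent)).

    theta[i, a, b] is agent i's candidate payoff for playing a while the
    opponent plays b; observations is a list of joint profiles (a0, a1).
    Note that adding a constant to theta leaves the likelihood unchanged,
    mirroring the claim that only payoff *differences* are identifiable.
    """
    total = 0.0
    for a0, a1 in observations:
        # Counterfactual payoffs: hold the opponent's observed action
        # fixed and evaluate the candidate mechanism at every deviation.
        u0 = theta[0, :, a1]  # agent 0's payoffs vs. observed a1
        u1 = theta[1, :, a0]  # agent 1's payoffs vs. observed a0
        total -= np.log(softmax(beta * u0)[a0])
        total -= np.log(softmax(beta * u1)[a1])
    return total

# Toy check with a prisoner's-dilemma-like payoff table (hypothetical
# numbers): repeated mutual defection should be scored as more likely
# under the true table than under a table with the rows swapped.
true_theta = np.zeros((2, 2, 2))
true_theta[0] = [[3, 0], [5, 1]]  # rows: own action, cols: opponent action
true_theta[1] = [[3, 0], [5, 1]]
obs = [(1, 1)] * 20               # repeated mutual defection
print(nll(true_theta, obs) < nll(true_theta[:, ::-1, :].copy(), obs))
```

In a full estimator one would replace the table lookup with a differentiable mechanism network and minimize `nll` by gradient descent; the shift-invariance noted in the docstring is why only payoff differences, not absolute payoff levels, can be recovered from behavior.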