Systems of competing agents can often be modeled as games. Assuming rationality, the most likely outcomes are given by an equilibrium (e.g. a Nash equilibrium). In many practical settings, games are influenced by context, i.e. additional data beyond the control of any agent (e.g. weather for traffic and fiscal policy for market economies). Often the exact game mechanics are unknown, yet vast amounts of historical data consisting of (context, equilibrium) pairs are available, raising the possibility of learning a solver which predicts the equilibria given only the context. We introduce Nash Fixed Point Networks (N-FPNs), a class of neural networks that naturally output equilibria. Crucially, N-FPNs employ a constraint decoupling scheme to handle complicated agent action sets while avoiding expensive projections. Empirically, we find N-FPNs are compatible with the recently developed Jacobian-Free Backpropagation technique for training implicit networks, making them significantly faster and easier to train than prior models. Our experiments show N-FPNs are capable of scaling to problems orders of magnitude larger than existing learned game solvers.
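To make the idea of "outputting an equilibrium as a fixed point" concrete, here is a minimal sketch of a contextual two-player game whose Nash equilibrium is found by iterating a projected-gradient map. Everything in it is an illustrative assumption and not taken from the paper: the quadratic costs, the box action sets, the step size, and the names `game_operator`, `project_box`, and `solve_equilibrium`. In particular, it uses a plain projection onto a box, not the constraint decoupling scheme the abstract mentions, and the operator is fixed by hand, whereas in an N-FPN it would be a trained network.

```python
# Hypothetical toy game (not from the paper). Each player picks x_i in [0, 1].
#   Player 0 minimizes  x0**2 + x0*x1 + d[0]*x0
#   Player 1 minimizes  x1**2 - x0*x1 + d[1]*x1
# The context d shifts the costs and is outside either player's control.

def game_operator(x, d):
    """Stack each player's gradient of their own cost w.r.t. their own action."""
    return [2.0 * x[0] + x[1] + d[0],
            2.0 * x[1] - x[0] + d[1]]

def project_box(x, lo=0.0, hi=1.0):
    """Euclidean projection onto the box [lo, hi]^n (cheap for this simple set)."""
    return [min(max(xi, lo), hi) for xi in x]

def solve_equilibrium(d, step=0.2, tol=1e-10, max_iter=10_000):
    """Iterate x <- P_C(x - step * F(x; d)) until successive iterates agree.

    For this operator and step size the map is a contraction, so the
    iteration converges to the unique equilibrium for the given context d.
    """
    x = [0.5, 0.5]
    for _ in range(max_iter):
        f = game_operator(x, d)
        x_new = project_box([xi - step * fi for xi, fi in zip(x, f)])
        if max(abs(a - b) for a, b in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x

if __name__ == "__main__":
    print(solve_equilibrium([-1.0, -1.0]))  # equilibrium in the interior of the box
    print(solve_equilibrium([-5.0, 0.0]))   # equilibrium with an active bound on x0
```

A learned solver in the spirit of the abstract would replace `game_operator` with a parameterized model of the context and fit its parameters to observed (context, equilibrium) pairs. Jacobian-Free Backpropagation, as described in the literature on implicit networks, would then differentiate only through the final application of the fixed-point map. That training step is not shown here.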