Dynamic game theory is an increasingly popular tool for modeling multi-agent interactions, such as those between humans and robots. Game-theoretic models presume that each agent wishes to minimize a private cost function that depends on others' actions. These games typically evolve over a fixed time horizon, which specifies how far into the future each agent plans. In practical settings, however, decision-makers may vary in foresightedness, that is, in how much they care about their current cost in relation to their past and future costs. We conjecture that quantifying and estimating each agent's foresightedness from online data will enable safer and more efficient interactions with other agents. To this end, we frame this inference problem as an inverse dynamic game. We consider a specific objective function parametrization that smoothly interpolates between myopic and farsighted planning. Games of this form are readily transformed into parametric mixed complementarity problems; we exploit the directional differentiability of solutions to these problems with respect to their hidden parameters to solve for agents' foresightedness. We conduct three experiments: one with synthetically generated delivery robot motion, one with real-world data involving people walking, biking, and driving vehicles, and one using high-fidelity simulators. The results of these experiments demonstrate that explicitly inferring agents' foresightedness yields game-theoretic models of agents' behavior that are 33% more accurate.
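To make the idea of interpolating between myopic and farsighted planning concrete, the following is a minimal sketch only. The abstract does not state the parametrization used, so the discount-factor form here is an assumption, and the names `discounted_cost` and `gamma` are hypothetical. With `gamma = 0` only the current stage cost counts (myopic); with `gamma = 1` all stage costs over the horizon count equally (farsighted).

```python
def discounted_cost(stage_costs, gamma):
    """Total cost sum_t gamma**t * stage_costs[t].

    ASSUMPTION: a discount factor is used here purely for illustration;
    the parametrization in the paper may differ.
    gamma = 0 keeps only the current cost, gamma = 1 weights all costs equally.
    """
    return sum((gamma ** t) * c for t, c in enumerate(stage_costs))


# Hypothetical stage costs over a three-step horizon.
stage_costs = [1.0, 2.0, 4.0]

myopic = discounted_cost(stage_costs, 0.0)      # only the first cost counts
farsighted = discounted_cost(stage_costs, 1.0)  # plain sum over the horizon
in_between = discounted_cost(stage_costs, 0.5)  # intermediate weighting
```

Under this assumed form, inferring an agent's foresightedness would amount to estimating `gamma` from observed behavior; the intermediate value lies between the myopic and farsighted totals.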