Many autonomous agents, such as intelligent vehicles, must interact with one another. Game theory provides a natural mathematical framework for robot motion planning in such interactive settings. However, tractable algorithms for these problems usually rely on a strong assumption: that the objectives of all players in the scene are known. To make such tools applicable to ego-centric planning with only local information, we propose an adaptive model-predictive game solver that jointly infers other players' objectives online and computes a corresponding generalized Nash equilibrium (GNE) strategy. The adaptivity of our approach is enabled by a differentiable trajectory game solver whose gradient signal is used for maximum likelihood estimation (MLE) of opponents' objectives. This differentiability also facilitates direct integration with other differentiable elements, such as neural networks (NNs). Furthermore, in contrast to existing solvers for cost inference in games, our method handles not only partial state observations but also general inequality constraints. In two simulated traffic scenarios, our approach outperforms both existing game-theoretic methods and non-game-theoretic model-predictive control (MPC) approaches. We also demonstrate our approach's real-time planning capabilities and robustness in two hardware experiments.
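The idea of estimating an opponent's objective by differentiating through an equilibrium solver can be illustrated with a deliberately tiny sketch. It is not the solver described above: it assumes a hypothetical two-player scalar quadratic game with no constraints, a closed-form Nash equilibrium in place of a trajectory game solver, Gaussian observation noise, and a hand-derived gradient in place of automatic differentiation. All names (`solve_game`, `estimate_theta`, `plan`, the coupling weight `c`) are illustrative assumptions.

```python
def solve_game(g1, theta, c=0.5):
    """Closed-form Nash equilibrium of a toy game (an assumption for illustration).

    Player 1 (ego) minimizes (x1 - g1)^2 + c * (x1 - x2)^2 over x1.
    Player 2 minimizes (x2 - theta)^2 + c * (x2 - x1)^2 over x2.
    Setting both players' own-variable gradients to zero gives a 2x2 linear
    system with determinant 1 + 2c, solved here by hand.
    """
    det = 1.0 + 2.0 * c
    x1 = ((1.0 + c) * g1 + c * theta) / det
    x2 = (c * g1 + (1.0 + c) * theta) / det
    return x1, x2


def dx2_dtheta(c=0.5):
    """Sensitivity of the opponent's equilibrium action to its unknown goal."""
    return (1.0 + c) / (1.0 + 2.0 * c)


def estimate_theta(observations, g1, c=0.5, theta0=0.0, step=0.5, iters=200):
    """Gradient descent on the mean squared residual between observed opponent
    actions and the equilibrium prediction. Under Gaussian noise this equals
    the negative log-likelihood up to constants, so the minimizer is the MLE."""
    theta = theta0
    n = len(observations)
    for _ in range(iters):
        _, x2 = solve_game(g1, theta, c)
        # Chain rule: d(loss)/d(theta) = d(loss)/d(x2) * d(x2)/d(theta).
        dloss_dx2 = sum(-2.0 * (y - x2) for y in observations) / n
        theta -= step * dloss_dx2 * dx2_dtheta(c)
    return theta


def plan(observations, g1, c=0.5):
    """Infer the opponent's goal, then return the ego's equilibrium action."""
    theta_hat = estimate_theta(observations, g1, c)
    x1, _ = solve_game(g1, theta_hat, c)
    return x1, theta_hat
```

Calling `plan(observations, g1)` with a list of observed opponent actions returns the ego action together with the inferred opponent goal. Only the opponent's action is observed here, which loosely mirrors the partial-observation setting; inequality constraints and multi-step trajectories, which the method above handles, are outside the scope of this sketch.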