We address the problem of finding a mixed-strategy Nash equilibrium for crowd navigation. A mixed-strategy Nash equilibrium provides a rigorous model for the robot to anticipate uncertain yet cooperative human behavior in crowds, but its computational cost is often too high for scalable, real-time decision-making. Here we prove that a simple iterative Bayesian updating scheme converges to the Nash equilibrium of a mixed-strategy social navigation game. Furthermore, we propose a data-driven framework that constructs the game by initializing agent strategies as Gaussian processes learned from human datasets. Based on the proposed mixed-strategy Nash equilibrium model, we develop a sampling-based crowd navigation framework that can be integrated into existing navigation methods and runs in real time on a laptop CPU. We evaluate our framework both in simulated environments and on real-world human datasets collected in unstructured environments. Our framework consistently outperforms both non-learning and learning-based methods in safety and navigation efficiency, and reaches human-level crowd navigation performance on top of a meta-planner.
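The iterative Bayesian updating idea can be illustrated with a small sample-based sketch. This is a minimal illustration under stated assumptions, not the implementation described in the paper: each agent's mixed strategy is represented by weights over a fixed set of sampled trajectories (hand-built straight lines here, standing in for Gaussian-process samples), the risk function in `pairwise_risk` and its `scale` parameter are hypothetical, and the update rule (new weights proportional to the prior times the exponentiated negative expected risk against the other agents) is an assumed form of the scheme. All function names are invented for this example.

```python
import numpy as np


def straight_line(start, goal, n_steps=9):
    """Hypothetical helper: a straight-line trajectory of shape (n_steps, 2)."""
    return np.linspace(np.asarray(start, float), np.asarray(goal, float), n_steps)


def pairwise_risk(samples, scale=0.5):
    """Assumed risk model: peak of a Gaussian kernel on inter-agent distance.

    samples: array of shape (n_agents, n_samples, n_steps, 2).
    Returns risk[i, j, m, n] between sample m of agent i and sample n of agent j.
    """
    diff = samples[:, None, :, None] - samples[None, :, None, :]
    d2 = np.sum(diff ** 2, axis=-1)               # squared distance per time step
    return np.exp(-d2 / (2.0 * scale ** 2)).max(axis=-1)


def iterative_bayes_update(samples, n_iters=10, scale=0.5):
    """Re-weight each agent's samples against the others' current strategies."""
    n_agents, n_samples = samples.shape[:2]
    risk = pairwise_risk(samples, scale)
    # Samples are assumed drawn from each agent's prior, so the prior is uniform.
    weights = np.full((n_agents, n_samples), 1.0 / n_samples)
    for _ in range(n_iters):
        for i in range(n_agents):
            expected = np.zeros(n_samples)
            for j in range(n_agents):
                if j != i:
                    # Expected risk of each sample of i under agent j's weights.
                    expected += risk[i, j] @ weights[j]
            posterior = np.exp(-expected)          # likelihood times uniform prior
            weights[i] = posterior / posterior.sum()
    return weights


# Two agents walking head-on; agent 0 may also pass with a lateral offset of 1.
agent0 = [straight_line((-2, 0), (2, 0)), straight_line((-2, 1), (2, 1))]
agent1 = [straight_line((2, 0), (-2, 0)), straight_line((2, 0), (-2, 0))]
demo_samples = np.array([agent0, agent1])
demo_weights = iterative_bayes_update(demo_samples)
print(demo_weights)
```

In this toy setup the weight of agent 0 shifts toward the laterally offset trajectory, since it carries a lower expected risk against agent 1. A practical system would draw many samples per agent and choose the risk model to fit the application.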