We study the problem of safe and intention-aware robot navigation in dense and interactive crowds. Most previous reinforcement learning (RL)-based methods either fail to consider the different types of interactions among all agents or ignore the intentions of people, which degrades performance. To learn a safe and efficient robot policy, we propose a novel recurrent graph neural network with attention mechanisms to capture heterogeneous interactions among agents through space and time. To encourage far-sighted robot behaviors, we infer the intentions of dynamic agents by predicting their future trajectories for several timesteps. The predictions are incorporated into a model-free RL framework to prevent the robot from intruding into the intended paths of other agents. We demonstrate that our method enables the robot to achieve good navigation performance and non-invasiveness in challenging crowd navigation scenarios. We successfully transfer the policy learned in simulation to a real-world TurtleBot 2i. Our code and videos are available at https://sites.google.com/view/intention-aware-crowdnav/home.
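One way to picture how predicted trajectories could discourage the robot from intruding into the intended paths of other agents is a reward penalty that fires whenever the robot overlaps a predicted future position. The sketch below is illustrative only and is not taken from the released code: the function name `intrusion_penalty`, the parameters `radius_sum`, `base_penalty`, and `decay`, and the choice to shrink the penalty geometrically for predictions further in the future are all assumptions.

```python
import math


def intrusion_penalty(robot_xy, predicted_paths, radius_sum,
                      base_penalty=-1.0, decay=0.5):
    """Return a non-positive reward term for intruding on predicted paths.

    robot_xy:        (x, y) position of the robot at the current step.
    predicted_paths: one list per agent of predicted (x, y) positions
                     for future steps k = 1..K (hypothetical format).
    radius_sum:      robot radius plus agent radius; overlap threshold.
    base_penalty:    assumed penalty scale (negative).
    decay:           assumed per-step discount for later predictions.
    """
    worst = 0.0
    for path in predicted_paths:
        for k, (px, py) in enumerate(path, start=1):
            dist = math.hypot(robot_xy[0] - px, robot_xy[1] - py)
            if dist < radius_sum:
                # Later predictions are less certain, so they are penalized less.
                worst = min(worst, base_penalty * decay ** k)
    return worst


# The robot overlaps the first agent's step-1 prediction only.
paths = [[(0.1, 0.0), (5.0, 5.0)], [(3.0, 3.0), (4.0, 4.0)]]
penalty = intrusion_penalty((0.0, 0.0), paths, radius_sum=0.5)
```

Under these assumptions, a term of this kind would be added to the usual goal-reaching and collision rewards of a model-free RL agent, so the policy is penalized before an actual collision becomes imminent.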