Developing robotic technologies for use in human society requires ensuring that robots navigate safely while adhering to pedestrians' expectations and social norms. However, maintaining real-time communication between robots and pedestrians to avoid collisions can be challenging. To address these challenges, we propose a novel socially-aware navigation benchmark called NaviSTAR, which utilizes a hybrid Spatio-Temporal grAph tRansformer (STAR) to understand interactions in human-rich environments by fusing potential multi-modal crowd information. We leverage an off-policy reinforcement learning algorithm with preference learning to train a policy and a reward function network under supervisor guidance. Additionally, we design a social score function to evaluate the overall performance of social navigation. For comparison, we train and test our algorithm and other state-of-the-art methods independently in both simulated and real-world scenarios. Our results show that NaviSTAR outperforms previous methods.\footnote{The source code and experiment videos of this work are available at: https://sites.google.com/view/san-navistar}
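To make the preference-learning step mentioned in the abstract more concrete, the following is a minimal sketch of how a reward function can be fit to supervisor preferences over pairs of trajectory segments. It assumes the common Bradley-Terry formulation, which the abstract does not confirm as the one NaviSTAR uses. The function names (`segment_return`, `preference_probability`, `preference_loss`) and the label convention are hypothetical and chosen only for illustration.

```python
import math


def segment_return(rewards):
    """Sum of predicted per-step rewards over one trajectory segment."""
    return sum(rewards)


def preference_probability(rewards_a, rewards_b):
    """Bradley-Terry probability that segment A is preferred over segment B.

    Assumption: preferences follow a logistic model of the return difference.
    """
    diff = segment_return(rewards_a) - segment_return(rewards_b)
    # Logistic function of the return difference, branched so that
    # math.exp is only ever called on a non-positive argument.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def preference_loss(rewards_a, rewards_b, label):
    """Cross-entropy between a supervisor label and the predicted preference.

    Hypothetical label convention: 1.0 means the supervisor preferred A,
    0.0 means B was preferred, and 0.5 marks the two segments as equal.
    """
    p = preference_probability(rewards_a, rewards_b)
    eps = 1e-12  # guards against log(0) when p saturates
    return -(label * math.log(p + eps) + (1.0 - label) * math.log(1.0 - p + eps))


if __name__ == "__main__":
    # Predicted per-step rewards for two short segments (made-up numbers).
    seg_a = [0.4, 0.3, 0.5]
    seg_b = [0.1, 0.0, 0.2]
    print(preference_probability(seg_a, seg_b))
    print(preference_loss(seg_a, seg_b, label=1.0))
```

In a full pipeline, the per-step rewards would come from the reward function network, and this loss would be minimized over a set of supervisor-labelled segment pairs while the off-policy learner optimizes the policy against the learned reward.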