It is essential for autonomous robots to be socially compliant while navigating in human-populated environments. Machine Learning and, especially, Deep Reinforcement Learning (DRL) have recently gained considerable traction in the field of Social Navigation. This is partly because the resulting policies are not bound by human limitations on code complexity or on the number of variables handled. Unfortunately, the lack of safety guarantees and the large data requirements of DRL algorithms make learning in the real world infeasible. To bridge this gap, simulation environments are frequently used. We propose SocNavGym, an advanced simulation environment for social navigation that can generate a wide variety of social navigation scenarios and facilitates the development of intelligent social agents. SocNavGym is lightweight, fast, and easy to use, and can be effortlessly configured to generate different types of social navigation scenarios. It can also be configured to work with different hand-crafted and data-driven social reward signals and to yield a variety of evaluation metrics to benchmark agents' performance. Further, we provide a case study in which a Dueling-DQN agent is trained to learn social-navigation policies using SocNavGym. The results provide evidence that SocNavGym can be used to train an agent from scratch to navigate in simple as well as complex social scenarios. Our experiments also show that agents trained using the data-driven reward function display more advanced social compliance than agents trained using the heuristic-based reward function.
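For readers unfamiliar with the Dueling-DQN agent used in the case study, the sketch below illustrates the aggregation step that distinguishes a dueling architecture: a scalar state value and per-action advantages are combined into Q-values by subtracting the mean advantage. This is a minimal illustration of the general technique, not the paper's implementation; the function names `dueling_q_values` and `greedy_action` and the numeric inputs are assumptions made for the example, and in practice the value and advantage would come from two heads of a neural network.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a) = V(s) + A(s, a) - mean(A(s, .)).

    Hypothetical helper for illustration; the inputs stand in for the
    outputs of the value and advantage streams of a dueling network.
    """
    if not advantages:
        raise ValueError("at least one action is required")
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + a - mean_advantage for a in advantages]


def greedy_action(q_values: List[float]) -> int:
    """Return the index of the action with the highest Q-value."""
    return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    # Made-up numbers: one state value and three action advantages.
    q = dueling_q_values(1.0, [1.0, 2.0, 3.0])
    print(q, greedy_action(q))
```

Subtracting the mean advantage keeps the value and advantage streams identifiable: adding a constant to every advantage leaves the resulting Q-values unchanged.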