A variety of autonomous navigation algorithms exist that allow robots to move around safely and quickly. However, many of these algorithms require their parameters to be re-tuned when facing new environments. In this paper, we propose PTDRL, a parameter-tuning strategy that adaptively selects, from a fixed set of parameters, those that maximize the expected reward for a given navigation system. Our learning strategy can be applied to different environments, different platforms, and different user preferences. Specifically, we address the problem of social navigation in indoor spaces, using a classical motion planning algorithm as our navigation system and tuning its parameters to optimize its behavior. Experimental results show that PTDRL can outperform other online parameter-tuning strategies.
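To make the idea of adaptively selecting from a fixed set of parameters concrete, the following is a minimal sketch and not the PTDRL implementation. It replaces deep reinforcement learning with a simple tabular epsilon-greedy contextual bandit, and everything in it is an assumption made for illustration: the parameter names (`max_vel_x`, `inflation_radius`), the two contexts (open versus crowded space), and the toy reward table.

```python
import random

# Hypothetical fixed set of planner parameter configurations.
# The parameter names and values are illustrative assumptions.
PARAM_SETS = [
    {"max_vel_x": 0.3, "inflation_radius": 0.6},  # cautious
    {"max_vel_x": 0.6, "inflation_radius": 0.4},  # balanced
    {"max_vel_x": 1.0, "inflation_radius": 0.2},  # fast
]

# Toy mean reward per (context, parameter set); an assumption, not real data.
# Context 0: open space (speed pays off). Context 1: crowded space (caution pays off).
MEAN_REWARD = [
    [0.2, 0.5, 1.0],
    [1.0, 0.5, 0.1],
]


class ParamSelector:
    """Epsilon-greedy choice among a fixed set of parameter configurations."""

    def __init__(self, n_contexts, n_param_sets, epsilon=0.1, lr=0.1, seed=0):
        # q[c][a] estimates the expected reward of parameter set a in context c.
        self.q = [[0.0] * n_param_sets for _ in range(n_contexts)]
        self.epsilon = epsilon
        self.lr = lr
        self.rng = random.Random(seed)

    def greedy(self, context):
        # Index of the parameter set with the highest estimated reward.
        row = self.q[context]
        return max(range(len(row)), key=row.__getitem__)

    def select(self, context):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.q[context]))
        return self.greedy(context)

    def update(self, context, action, reward):
        # Move the estimate a small step toward the observed reward.
        self.q[context][action] += self.lr * (reward - self.q[context][action])


def train(steps=2000, seed=0):
    selector = ParamSelector(len(MEAN_REWARD), len(PARAM_SETS), seed=seed)
    env_rng = random.Random(seed + 1)
    for _ in range(steps):
        context = env_rng.randrange(len(MEAN_REWARD))
        action = selector.select(context)
        # Noisy reward standing in for a real navigation episode.
        reward = MEAN_REWARD[context][action] + env_rng.gauss(0.0, 0.1)
        selector.update(context, action, reward)
    return selector


selector = train()
for context, name in enumerate(["open", "crowded"]):
    print(name, PARAM_SETS[selector.greedy(context)])
```

In this toy setting the selector learns to prefer the fast configuration in open space and the cautious one in crowded space. A deep reinforcement learning agent would replace the table `q` with a learned function of richer observations, but the selection step over a fixed set of configurations has the same shape.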