When robots are deployed in the field for environmental monitoring, they typically execute pre-programmed motions, such as lawnmower paths, rather than using adaptive methods, such as informative path planning. One reason is that adaptive methods depend on parameter choices that are both critical to set correctly and difficult for a non-specialist to choose. Here, we show how to automatically configure a planner for informative path planning by training a reinforcement learning agent to select the planner parameters at each planning iteration. We demonstrate our method on 37 instances of 3 distinct environments and compare it against pure (end-to-end) reinforcement learning techniques, as well as approaches that do not use a learned model to change the planner parameters. Our method shows a 9.53% mean improvement in cumulative reward across diverse environments compared to end-to-end learning-based methods; we also demonstrate in a field experiment how it can readily be used to facilitate high-performance deployment of an information-gathering robot.
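The core idea, a learned agent choosing planner parameters at every planning iteration, can be illustrated with a deliberately small sketch. Everything in it is an assumption made for illustration: the 1-D field, the toy lookahead planner `plan_step`, the candidate horizons in `PARAM_CHOICES`, and the epsilon-greedy bandit that stands in for the reinforcement learning agent. It is not the planner, agent, or environments used in the work summarized above.

```python
import random

# Hypothetical candidate settings: lookahead horizon of the toy planner.
PARAM_CHOICES = [1, 2, 4]


def plan_step(field, pos, horizon):
    """Toy planner: step toward the side with more uncollected value
    within `horizon` cells of the current position."""
    def gain(direction):
        cells = [pos + direction * k for k in range(1, horizon + 1)]
        return sum(field[c] for c in cells if 0 <= c < len(field))

    step = 1 if gain(+1) >= gain(-1) else -1
    # Clamp to the field boundaries.
    return min(max(pos + step, 0), len(field) - 1)


def run_episode(field, start, steps, q_values, counts, rng, epsilon=0.1):
    """Run one episode; before each planner call, the agent picks a parameter."""
    field = list(field)  # copy, so collected cells reset between episodes
    pos, total = start, 0.0
    for _ in range(steps):
        # Epsilon-greedy selection over parameter settings
        # (a bandit simplification of a reinforcement learning policy).
        if rng.random() < epsilon:
            action = rng.randrange(len(PARAM_CHOICES))
        else:
            action = max(range(len(PARAM_CHOICES)), key=lambda i: q_values[i])

        pos = plan_step(field, pos, PARAM_CHOICES[action])
        reward = field[pos]
        field[pos] = 0.0  # information at a cell can be collected only once
        total += reward

        # Incremental mean update of the value estimate for this setting.
        counts[action] += 1
        q_values[action] += (reward - q_values[action]) / counts[action]
    return total


rng = random.Random(0)
field = [rng.random() for _ in range(20)]
q_values = [0.0] * len(PARAM_CHOICES)
counts = [0] * len(PARAM_CHOICES)
returns = [run_episode(field, 10, 15, q_values, counts, rng) for _ in range(50)]
```

The point of the structure is that the planner itself is left unchanged and only its parameters are chosen by the learned component, in contrast to an end-to-end policy that would output robot motions directly.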