Autonomous robots are often employed for data collection due to their efficiency and low labour costs. A key task in robotic data acquisition is planning paths through an initially unknown environment to collect observations under platform-specific resource constraints, such as limited battery life. Adaptive online path planning in 3D environments is challenging due to the large set of valid actions and the presence of unknown occlusions. To address these issues, we propose a novel deep reinforcement learning approach for adaptively replanning robot paths to map targets of interest in unknown 3D environments. A key aspect of our approach is a dynamically constructed graph that restricts planning actions to the robot's local neighbourhood, allowing us to react quickly to newly discovered obstacles and targets of interest. For replanning, we propose a new reward function that balances exploring the unknown environment against exploiting online-collected data about the targets of interest. Our experiments show that our method detects targets more efficiently than state-of-the-art learning and non-learning baselines. We also show the applicability of our approach to orchard monitoring using an unmanned aerial vehicle in a photorealistic simulator.
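As an illustration only, the following Python sketch shows one way a robot-local action set and an exploration-exploitation reward could be expressed. It is not the method's actual implementation: the function names `local_action_nodes` and `step_reward`, the integer lattice sampling, the `radius` parameter, and the linear weighting `w_explore` are all assumptions made for this example.

```python
import math
from itertools import product


def local_action_nodes(robot, occupied, radius=2):
    """Return candidate waypoints near the robot (hypothetical helper).

    Assumption: candidates lie on an integer lattice, must be within
    `radius` (Euclidean) of the robot, and must not be known-occupied.
    """
    offsets = range(-radius, radius + 1)
    nodes = []
    for dx, dy, dz in product(offsets, repeat=3):
        cand = (robot[0] + dx, robot[1] + dy, robot[2] + dz)
        if cand == robot or cand in occupied:
            continue  # skip the current pose and known obstacles
        if math.dist(cand, robot) <= radius:
            nodes.append(cand)
    return nodes


def step_reward(new_unknown_seen, new_target_seen, w_explore=0.5):
    """Blend exploration and exploitation gains (assumed linear form).

    new_unknown_seen: count of previously unknown cells observed this step.
    new_target_seen:  count of newly observed target-of-interest cells.
    """
    return w_explore * new_unknown_seen + (1.0 - w_explore) * new_target_seen


if __name__ == "__main__":
    robot = (0, 0, 0)
    occupied = {(1, 0, 0)}  # a newly discovered obstacle
    actions = local_action_nodes(robot, occupied, radius=1)
    print(len(actions), "candidate actions")
    print(step_reward(new_unknown_seen=10, new_target_seen=4))
```

Rebuilding the candidate set at every step from the current pose and the latest occupancy information is what keeps the action space small in this sketch; a learned policy would then score only these local candidates.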