Catching high-speed targets in flight is a complex and representative highly dynamic task. In this paper, we propose Catch Planner, a planning-with-decision scheme for catching. For sequential decision making, we propose a policy search method based on deep reinforcement learning. To make catching adaptive and flexible, we propose a trajectory optimization method that jointly optimizes the highly coupled catching time and terminal state while accounting for dynamic feasibility and safety. We also propose a flexible constraint transcription method that allows targets to be caught at any reasonable attitude and terminal position bias. The proposed Catch Planner offers a new paradigm for combining learning and planning. It is integrated on a quadrotor of our own design and runs at 100 Hz on the onboard computer. Extensive experiments in real-world and simulated scenes verify the robustness of the proposed method and its extensibility to a variety of high-speed flying targets.