Catching high-speed targets in flight is a complex and representative highly dynamic task. In this paper, we propose Catch Planner, a planning-with-decision scheme for catching. For sequential decision making, we propose a policy search method based on deep reinforcement learning. To make catching adaptive and flexible, we propose a trajectory optimization method that jointly optimizes the highly coupled catching time and terminal state while accounting for dynamic feasibility and safety. We also propose a flexible constraint transcription method that allows targets to be caught at any reasonable attitude and with a terminal position bias. The proposed Catch Planner provides a new paradigm for combining learning and planning. It is integrated on a quadrotor of our own design and runs at 100 Hz on the onboard computer. Extensive experiments in real and simulated scenes verify the robustness of the proposed method and its extensibility to a variety of high-speed flying targets.
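To illustrate why the catching time and the terminal state are coupled, the following minimal sketch may help. It is not the optimizer proposed in the paper: it assumes a drag-free ballistic target, a quadrotor starting at rest, a constant-acceleration effort proxy that ignores gravity compensation and attitude, and a plain grid search over the catching time. All function names, parameters, and limits (`best_catch`, `a_max`, `time_weight`, and so on) are hypothetical and chosen only for illustration.

```python
import math

GRAVITY = (0.0, 0.0, -9.81)  # m/s^2, z axis pointing up (assumed convention)


def target_state(p0, v0, t):
    """Position and velocity of a ballistic target at time t (drag ignored)."""
    pos = tuple(p + v * t + 0.5 * g * t * t for p, v, g in zip(p0, v0, GRAVITY))
    vel = tuple(v + g * t for v, g in zip(v0, GRAVITY))
    return pos, vel


def required_accel(start, goal, T):
    """Constant acceleration that moves a vehicle at rest from start to goal in T."""
    return tuple(2.0 * (g - s) / (T * T) for g, s in zip(goal, start))


def best_catch(start, p0, v0, t_min=0.2, t_max=2.0, steps=181,
               a_max=15.0, time_weight=1.0):
    """Grid-search the catching time T; the terminal position follows from T.

    Returns (T, terminal_position, cost), or None if no candidate is feasible.
    """
    best = None
    for i in range(steps):
        T = t_min + (t_max - t_min) * i / (steps - 1)
        goal, _ = target_state(p0, v0, T)  # terminal state depends on T
        if goal[2] < 0.0:
            continue  # safety proxy: do not catch below the ground plane
        acc = required_accel(start, goal, T)
        acc_norm = math.sqrt(sum(a * a for a in acc))
        if acc_norm > a_max:
            continue  # dynamic-feasibility proxy: acceleration limit
        # Effort (squared acceleration integrated over T) plus a time penalty.
        cost = acc_norm ** 2 * T + time_weight * T
        if best is None or cost < best[2]:
            best = (T, goal, cost)
    return best
```

Calling `best_catch((0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (-3.0, 0.0, 4.0))` returns the sampled catching time with the lowest cost together with the target position at that time. Choosing an earlier time shortens the flight but demands more acceleration, while a later time lets the target fall closer to the ground, which is the trade-off that a joint optimization over time and terminal state has to resolve.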