This paper presents Deep-PANTHER, a learning-based perception-aware trajectory planner for unmanned aerial vehicles (UAVs) in dynamic environments. Given the current state of the UAV and the predicted trajectory and size of the obstacle, Deep-PANTHER generates multiple trajectories that avoid a dynamic obstacle while simultaneously maximizing the obstacle's presence in the field of view (FOV) of the onboard camera. To obtain a computationally tractable real-time solution, imitation learning is leveraged to train a Deep-PANTHER policy using demonstrations provided by a multimodal optimization-based expert. Extensive simulations show replanning times that are two orders of magnitude faster than those of the optimization-based expert, while achieving a similar cost. By ensuring that each expert trajectory is assigned to one distinct student trajectory in the loss function, Deep-PANTHER can also capture the multimodality of the problem and achieve a mean squared error (MSE) loss with respect to the expert that is up to 18 times smaller than state-of-the-art (Relaxed) Winner-Takes-All approaches. Deep-PANTHER is also shown to generalize well to obstacle trajectories that differ from the ones used in training.
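The one-to-one assignment between expert and student trajectories can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`mse`, `assignment_loss`, `shared_nearest_loss`), the flat-list trajectory representation, and the brute-force search over matchings are all assumptions made for illustration. A practical implementation would more likely use a linear assignment solver, and the comparison loss is only a simplified stand-in for a Winner-Takes-All style matching in which several expert trajectories may select the same student trajectory.

```python
from itertools import permutations


def mse(a, b):
    """Mean squared error between two equally long sequences of floats."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def assignment_loss(expert, student):
    """Match every expert trajectory to a distinct student trajectory.

    Brute-force search over all one-to-one matchings (hypothetical helper,
    only practical for a handful of trajectories). Returns (loss, matching),
    where matching[i] is the student index assigned to expert trajectory i.
    """
    if len(expert) > len(student):
        raise ValueError("need at least as many student as expert trajectories")
    best_loss, best_matching = float("inf"), None
    for perm in permutations(range(len(student)), len(expert)):
        loss = sum(mse(e, student[j]) for e, j in zip(expert, perm)) / len(expert)
        if loss < best_loss:
            best_loss, best_matching = loss, perm
    return best_loss, best_matching


def shared_nearest_loss(expert, student):
    """Simplified comparison: each expert trajectory picks its nearest
    student trajectory, and several experts may pick the same one."""
    return sum(min(mse(e, s) for s in student) for e in expert) / len(expert)


# Two expert modes (e.g. pass on either side), flattened to short lists.
expert = [[0.0, 1.0], [0.0, -1.0]]
# One student output sits between the modes; the other is far from both.
student = [[0.0, 0.1], [0.0, 5.0]]

loss, matching = assignment_loss(expert, student)
shared = shared_nearest_loss(expert, student)
```

In this toy case both expert trajectories are closest to the first student trajectory, so the shared-nearest loss never penalizes the second student output. The one-to-one matching forces the second student trajectory to account for one of the expert modes, which is the intuition behind capturing multimodality.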