This paper presents a novel learning-based trajectory planning framework for quadrotors that combines model-based optimization with deep learning. Specifically, we formulate trajectory optimization as a quadratic programming (QP) problem with dynamic and collision-avoidance constraints, using piecewise trajectory segments that pass through safe flight corridors [1]. We train neural networks to directly learn the time allocation for each segment so that the resulting trajectories are both smooth and fast. Furthermore, the constrained optimization problem is embedded in the network as a separate implicit layer through which gradients are back-propagated, which yields a differentiable loss function. To accelerate training and increase the success rate of the original optimization problem, we introduce an additional penalty function that penalizes time allocations whose solutions violate the constraints. We also support a variable number of piecewise trajectory segments by adding an extra end-of-sentence token during training. We evaluate the proposed method through extensive simulations and experiments and show that it runs in real time in diverse, cluttered environments.
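To make the role of time allocation concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single one-dimensional polynomial segment, a minimum-jerk objective, and rest-to-rest boundary conditions. It omits the corridor inequality constraints entirely, so the QP reduces to an equality-constrained problem solved through its KKT system. All function names are hypothetical.

```python
import numpy as np
from math import factorial


def deriv_row(t, order, n_coef):
    """Row r such that r @ c is the order-th derivative of sum_k c[k] * t**k at t."""
    row = np.zeros(n_coef)
    for k in range(order, n_coef):
        row[k] = factorial(k) / factorial(k - order) * t ** (k - order)
    return row


def jerk_cost_matrix(T, n_coef):
    """Q such that c @ Q @ c equals the integral of squared jerk over [0, T]."""
    Q = np.zeros((n_coef, n_coef))
    for i in range(3, n_coef):
        for j in range(3, n_coef):
            ci = factorial(i) / factorial(i - 3)
            cj = factorial(j) / factorial(j - 3)
            p = i + j - 5  # exponent after integrating t**(i-3) * t**(j-3)
            Q[i, j] = ci * cj * T ** p / p
    return Q


def solve_segment(p0, p1, T, n_coef=8):
    """Minimum-jerk segment from p0 to p1 in time T, at rest at both ends.

    Solves  min c^T Q(T) c  s.t.  A(T) c = b  via the KKT linear system.
    Assumption: no inequality (corridor) constraints are modelled here.
    """
    Q = jerk_cost_matrix(T, n_coef)
    # Position, velocity and acceleration constraints at t = 0 and t = T.
    A = np.array(
        [deriv_row(0.0, d, n_coef) for d in range(3)]
        + [deriv_row(T, d, n_coef) for d in range(3)]
    )
    b = np.array([p0, 0.0, 0.0, p1, 0.0, 0.0])
    m = A.shape[0]
    kkt = np.block([[2.0 * Q, A.T], [A, np.zeros((m, m))]])
    rhs = np.concatenate([np.zeros(n_coef), b])
    coeffs = np.linalg.solve(kkt, rhs)[:n_coef]
    return coeffs, float(coeffs @ Q @ coeffs)


def time_penalised_cost(p0, p1, T, weight):
    """Smoothness cost plus a linear penalty on duration (weight is an assumed knob)."""
    _, jerk_cost = solve_segment(p0, p1, T)
    return jerk_cost + weight * T


if __name__ == "__main__":
    for T in (0.5, 1.0, 2.0):
        _, J = solve_segment(0.0, 1.0, T)
        print(f"T = {T:.1f}  jerk cost = {J:.3f}")
```

In this simplified setting the optimal cost has the closed form 720 d² / T⁵ for a displacement d, so a short segment duration makes the trajectory sharply less smooth while a long one makes it slow. That trade-off is what a learned time allocation has to balance. Because the QP solution depends smoothly on T here, the same structure could in principle be differentiated with respect to the time allocation.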