In recent years, a growing number of researchers have applied deep neural networks to end-to-end flight navigation. This approach has gained traction because it bridges the gap between perception and planning found in traditional pipelines, thereby eliminating inter-module delays. However, replacing the original modules with neural networks in a black-box manner reduces the robustness and stability of the overall system: such networks lack principled explanations and often fail to generate high-quality motion trajectories consistently. Furthermore, these methods often struggle to account rigorously for the robot's kinematic constraints, producing trajectories that cannot be executed satisfactorily. In this work, we combine the advantages of traditional methods and neural networks by proposing an optimization-embedded neural network. The network learns high-quality trajectories directly from visual inputs, without the need for mapping, while ensuring dynamic feasibility. Specifically, a deep neural network extracts safe regions of the environment directly from depth images, and a model-based formulation then represents these regions as safety constraints in trajectory optimization. Because highly efficient optimization algorithms are available, our method converges robustly to feasible and optimal solutions that satisfy various user-defined constraints. Moreover, we differentiate the optimization process so that it can be trained as a layer within the neural network. This enables direct interaction between perception and planning and lets the network focus on the spatial regions where optimal solutions exist, which further improves the quality and stability of the generated trajectories.
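To make the idea of training through an optimization layer concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a strongly simplified setting: a one-dimensional trajectory of seven waypoints, a smoothness cost given by squared second differences, and equality constraints that pin the start, one intermediate "gate", and the goal. The gate position stands in for the safety region a network would predict, whereas the method described above uses safety regions as constraints rather than a single point. All names (`solve_linear`, `build_kkt`, `plan`, `grad_wrt_constraints`, `REF`) are hypothetical. The gradient of a downstream loss with respect to the constraint values is obtained by implicit differentiation of the KKT system, which is one common way to differentiate a quadratic program.

```python
def solve_linear(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    z = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (aug[r][n] - s) / aug[r][r]
    return z


def build_kkt(n, fixed_idx):
    """KKT matrix [[Q, A^T], [A, 0]] for
    min 0.5 * sum_k (x_k - 2 x_{k+1} + x_{k+2})^2  s.t.  x[i] = b_i, i in fixed_idx."""
    m = len(fixed_idx)
    K = [[0.0] * (n + m) for _ in range(n + m)]
    for k in range(n - 2):
        stencil = {k: 1.0, k + 1: -2.0, k + 2: 1.0}
        for i, wi in stencil.items():
            for j, wj in stencil.items():
                K[i][j] += wi * wj  # Q accumulates the outer products w w^T
    for r, i in enumerate(fixed_idx):
        K[i][n + r] = 1.0  # A^T block
        K[n + r][i] = 1.0  # A block
    return K


def plan(K, n, b):
    """Forward pass: solve K [x; lam] = [0; b] and return the trajectory x."""
    return solve_linear(K, [0.0] * n + list(b))[:n]


def grad_wrt_constraints(K, n, m, grad_x):
    """Backward pass: K is symmetric, so dL/db is the last m entries
    of the solution of K w = [dL/dx; 0]."""
    return solve_linear(K, list(grad_x) + [0.0] * m)[n:]


N = 7
FIXED = [0, 3, 6]                  # start, gate, goal waypoint indices
K = build_kkt(N, FIXED)
b = [0.0, 1.5, 1.0]                # b[1] plays the role of the network's prediction
REF = [i / 6.0 for i in range(N)]  # hypothetical reference trajectory

traj = plan(K, N, b)
grad_x = [xi - ri for xi, ri in zip(traj, REF)]  # gradient of 0.5 * ||x - REF||^2
grad_b = grad_wrt_constraints(K, N, len(FIXED), grad_x)
print("trajectory:", [round(v, 3) for v in traj])
print("dL/db:", [round(v, 3) for v in grad_b])
```

In a full pipeline, `grad_b` would be passed back into the perception network, so the planner's loss shapes the predicted constraints directly. Handling inequality constraints, as safety regions require, would need the complementarity conditions in the backward pass and is beyond this sketch.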