A growing number of researchers are using deep neural networks for end-to-end flight navigation. The approach has gained traction because it closes the gap between perception and planning found in traditional pipelines and thereby eliminates delays between modules. However, replacing the original modules with neural networks in a black-box manner reduces the robustness and stability of the overall system: such networks lack principled explanations and often fail to generate high-quality motion trajectories consistently. Moreover, these methods struggle to account rigorously for the robot's kinematic constraints, so the trajectories they produce often cannot be executed satisfactorily. In this work, we combine the advantages of traditional methods and neural networks by proposing an optimization-embedded neural network. The network learns high-quality trajectories directly from visual input, without the need for mapping, while ensuring dynamic feasibility. A deep neural network first extracts safe regions of the environment directly from depth images. We then use a model-based approach to represent these regions as safety constraints in trajectory optimization. Thanks to highly efficient optimization algorithms, our method converges robustly to feasible and optimal solutions that satisfy various user-defined constraints. We also differentiate the optimization process so that it can be trained as a layer within the neural network. This allows perception and planning to interact directly and lets the network focus on the spatial regions where optimal solutions exist, which further improves the quality and stability of the generated trajectories.
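The idea of training through an optimization step can be illustrated with a minimal sketch. It is not the formulation proposed in this work: the hard safety constraints are replaced here by a quadratic penalty that pulls each waypoint toward a hypothetical predicted safe-region center, so that the inner problem has a closed-form solution. The trajectory is one-dimensional, and all names (`solve_trajectory`, `backward`, `centers`, `weight`, `target`) are assumptions made for illustration. The backward pass uses implicit differentiation of the optimality condition, which is one common way to treat an optimizer as a network layer.

```python
import numpy as np


def second_difference_matrix(n):
    """Return the (n-2) x n matrix D with (D @ x)[i] = x[i] - 2*x[i+1] + x[i+2]."""
    D = np.zeros((n - 2, n))
    for i in range(n - 2):
        D[i, i:i + 3] = [1.0, -2.0, 1.0]
    return D


def solve_trajectory(centers, weight=1.0):
    """Forward pass of the toy optimization layer.

    Solves  min_x 0.5*||D x||^2 + 0.5*weight*||x - centers||^2,
    a smoothness cost plus a soft (assumed) safety penalty.
    The optimality condition is  H x* = weight * centers.
    """
    n = centers.shape[0]
    D = second_difference_matrix(n)
    H = D.T @ D + weight * np.eye(n)  # symmetric positive definite
    x_star = np.linalg.solve(H, weight * centers)
    return x_star, H


def backward(grad_x, H, weight=1.0):
    """Backward pass: map dL/dx* to dL/dcenters.

    From H x* = weight * centers we get dx*/dcenters = weight * H^{-1}.
    H is symmetric, so the transposed Jacobian equals the Jacobian.
    """
    return weight * np.linalg.solve(H, grad_x)


def loss_and_grad(centers, target, weight=1.0):
    """Outer loss 0.5*||x* - target||^2 and its gradient w.r.t. centers."""
    x_star, H = solve_trajectory(centers, weight)
    residual = x_star - target
    loss = 0.5 * float(residual @ residual)
    return loss, backward(residual, H, weight)


if __name__ == "__main__":
    centers = np.array([0.0, 1.0, 0.5, 2.0, 1.5, 3.0])  # stand-in for network output
    target = np.linspace(0.0, 3.0, centers.shape[0])    # stand-in for a reference trajectory
    loss, grad = loss_and_grad(centers, target, weight=2.0)
    print("loss:", loss)
    print("gradient w.r.t. centers:", grad)
```

In a real pipeline the gradient returned by `backward` would be passed on to the perception network that produced `centers`. Handling hard inequality constraints would instead require differentiating the full KKT conditions of the constrained problem.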