Navigating narrow, irregular gaps is an essential skill for autonomous drones in applications such as inspection, search-and-rescue, and disaster response. However, traditional planning and control methods rely on explicit gap extraction and measurement, while recent end-to-end approaches often assume regularly shaped gaps, leading to poor generalization and limited practicality. In this work, we present a fully vision-based, end-to-end framework that maps depth images directly to control commands, enabling drones to traverse complex gaps in unseen environments. Operating in the special Euclidean group SE(3), where position and orientation are tightly coupled, the framework combines differentiable simulation, a Stop-Gradient operator, and a Bimodal Initialization Distribution to achieve stable traversal through consecutive gaps. Two auxiliary prediction modules, a gap-crossing success classifier and a traversability predictor, further enhance continuous navigation and safety. Extensive simulation and real-world experiments demonstrate the approach's effectiveness, generalization capability, and practical robustness.