Autonomous drone racing requires the tight coupling of perception, planning, and control under extreme agility. However, existing approaches typically rely on precomputed spatial reference trajectories or explicit 6-DoF gate pose estimation, rendering them brittle to spatial perturbations, unmodeled track changes, and sensor noise. Conversely, end-to-end learned policies frequently overfit to specific track layouts and struggle with zero-shot generalization. To address these fundamental limitations, we propose a fully onboard, vision-guided optimal control framework that enables reference-free agile flight through arbitrarily placed and oriented gates. Central to our approach is Gate-SDF, a novel, implicitly learned neural signed distance field. Gate-SDF directly processes raw, noisy depth images to predict a continuous spatial field that provides both collision repulsion and active geometric guidance toward the valid traversal area. We seamlessly integrate this representation into a sampling-based Model Predictive Path Integral (MPPI) controller. By fully exploiting GPU parallelism, the framework evaluates these continuous spatial constraints across thousands of simulated trajectory rollouts simultaneously in real time. Furthermore, our formulation inherently maintains spatial consistency, ensuring robust navigation even under severe visual occlusion during aggressive maneuvers. Extensive simulations and real-world experiments demonstrate that the proposed system achieves high-speed agile flight and successfully navigates unseen tracks subject to severe unmodeled gate displacements and orientation perturbations. Videos are available at https://zhaofangguo.github.io/vision_guided_mppi/
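To make the coupling between the SDF representation and the sampling-based controller concrete, the following is a minimal, hypothetical sketch (in JAX, chosen for its GPU-parallel vmap) of how an SDF-derived cost might be evaluated across many MPPI rollouts at once. The analytic torus SDF standing in for Gate-SDF, the double-integrator dynamics, the horizon, sample count, and cost weights are all illustrative assumptions, not the paper's implementation.

```python
# Minimal, hypothetical sketch: an SDF-based cost inside a vmapped MPPI update.
# gate_frame_sdf, dynamics_step, and all constants below are illustrative
# assumptions, not the authors' implementation.
import jax
import jax.numpy as jnp

HORIZON, N_SAMPLES, DT = 30, 2048, 0.05  # assumed rollout settings

def gate_frame_sdf(p):
    # Placeholder for the learned Gate-SDF network: signed distance from a 3D
    # point to the gate structure. Stand-in here: a torus (ring radius 1.0 m,
    # tube radius 0.15 m) centered at the origin, opening facing the x-axis.
    q = jnp.array([jnp.linalg.norm(p[1:3]) - 1.0, p[0]])
    return jnp.linalg.norm(q) - 0.15

def dynamics_step(state, u):
    # Double-integrator stand-in for the quadrotor rollout dynamics.
    pos, vel = state[:3], state[3:]
    vel = vel + u * DT
    return jnp.concatenate([pos + vel * DT, vel])

def rollout_cost(state0, controls):
    # Simulate one sampled control sequence and accumulate a cost combining
    # collision repulsion (stay clear of the gate frame) and geometric
    # guidance (be drawn toward the valid traversal area at the origin).
    def step(state, u):
        nxt = dynamics_step(state, u)
        d = gate_frame_sdf(nxt[:3])
        repel = 100.0 * jnp.maximum(0.0, 0.30 - d)
        guide = 1.0 * jnp.linalg.norm(nxt[:3])
        return nxt, repel + guide
    _, costs = jax.lax.scan(step, state0, controls)
    return costs.sum()

def mppi_update(state0, nominal_u, key, sigma=2.0, lam=1.0):
    # Standard MPPI step: perturb the nominal controls, score all rollouts in
    # parallel (vmap -> GPU), and return the softmin-weighted average sequence.
    noise = sigma * jax.random.normal(key, (N_SAMPLES, HORIZON, 3))
    samples = nominal_u[None] + noise
    costs = jax.vmap(rollout_cost, in_axes=(None, 0))(state0, samples)
    weights = jax.nn.softmax(-costs / lam)
    return jnp.einsum("n,nhc->hc", weights, samples)

state0 = jnp.zeros(6).at[0].set(-3.0)  # start 3 m in front of the gate
u_plan = jax.jit(mppi_update)(state0, jnp.zeros((HORIZON, 3)), jax.random.PRNGKey(0))
```

In the system described above, the analytic placeholder would be replaced by the Gate-SDF network queried at predicted rollout positions and the stand-in dynamics by the quadrotor model; only the generic MPPI structure (sampling, parallel cost evaluation, softmin-weighted averaging) is standard.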