GustPilot: A Hierarchical DRL-INDI Framework for Wind-Resilient Quadrotor Navigation

Wind disturbances remain a key barrier to reliable autonomous navigation for lightweight quadrotors, where rapidly varying airflow can destabilize both planning and tracking. This paper introduces GustPilot, a hierarchical wind-resilient navigation stack in which a deep reinforcement learning (DRL) policy generates inertial-frame velocity references for gate traversal, while a geometric Incremental Nonlinear Dynamic Inversion (INDI) controller provides low-level tracking with fast residual disturbance rejection. The INDI layer applies incremental feedback on both specific linear acceleration and angular acceleration rate, using onboard sensor measurements to reject wind disturbances rapidly. Robustness is obtained through a two-level strategy: wind-aware planning, learned via fan-jet domain randomization during training, and rapid execution-time disturbance rejection by the INDI tracking controller. We evaluate GustPilot in real flights on a 50 g quadrotor platform against a DRL-PID baseline across four scenarios, ranging from no wind to fully dynamic conditions with a moving gate and a moving disturbance source. Despite being trained only in a minimal single-gate, single-fan setup, the policy generalizes without retraining to significantly more complex environments (up to six gates and four fans). Across 80 experiments, DRL-INDI achieves an average Overall Success Rate (OSR) of 94.7% versus 55.0% for DRL-PID, reduces tracking RMSE by up to 50%, and sustains speeds of up to 1.34 m/s under wind disturbances of up to 3.5 m/s. These results demonstrate that combining DRL-based velocity planning with structured INDI disturbance rejection provides a practical and generalizable approach to wind-resilient autonomous flight.
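
The abstract does not give the controller equations, so the following is only a minimal sketch of how incremental feedback on linear acceleration can cancel a wind disturbance. It uses a generic translational INDI update on a point mass, not the authors' geometric controller. The function names, the proportional velocity gain `k_v`, the z-up gravity convention, and the constant-wind model are all assumptions made for illustration.

```python
from typing import List, Sequence

# Assumed convention: inertial frame with z pointing up, units of m/s^2.
GRAVITY = (0.0, 0.0, -9.81)


def velocity_to_accel_cmd(v_ref: Sequence[float],
                          v_meas: Sequence[float],
                          k_v: float = 4.0) -> List[float]:
    """Hypothetical proportional velocity loop: velocity error -> desired acceleration."""
    return [k_v * (r - m) for r, m in zip(v_ref, v_meas)]


def indi_specific_thrust(a_cmd: Sequence[float],
                         a_meas: Sequence[float],
                         thrust_prev: Sequence[float]) -> List[float]:
    """Incremental update: previous specific thrust plus the acceleration error.

    Any unmodeled force (such as wind) already shows up in a_meas, so it is
    compensated without an explicit wind model.
    """
    return [t + (c - m) for t, c, m in zip(thrust_prev, a_cmd, a_meas)]


# Usage: hovering vehicle hit by an assumed constant wind acceleration along x.
wind = (1.5, 0.0, 0.0)
thrust_prev = [-g for g in GRAVITY]  # specific thrust that holds hover in still air
a_meas = [t + g + w for t, g, w in zip(thrust_prev, GRAVITY, wind)]  # what the IMU would report
a_cmd = velocity_to_accel_cmd((0.0, 0.0, 0.0), (0.0, 0.0, 0.0))  # hold position
thrust_cmd = indi_specific_thrust(a_cmd, a_meas, thrust_prev)

# Net acceleration once the new thrust is applied under the same wind.
residual = [t + g + w for t, g, w in zip(thrust_cmd, GRAVITY, wind)]
print(thrust_cmd, residual)
```

In this idealized setting the updated thrust tilts against the wind and the residual acceleration vanishes after one step. A real implementation would also need filtered measurements, actuator dynamics, and the attitude and angular-acceleration loop, all of which this sketch omits.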
