Autonomous fall recovery is a critical capability for quadrotors operating in real-world environments, where collisions or failures may leave the vehicle resting on the ground in an arbitrary attitude. The problem is challenging because recovery must be achieved with limited onboard sensing, in constrained free space, under ground contact, and in the presence of unknown disturbances. In this letter, we present a reinforcement learning (RL) framework for autonomous fall recovery of a quadrotor from arbitrary ground attitudes to stable hover using only lightweight onboard sensors. To address severe partial observability and intermittent sensor invalidity, we train a recurrent policy within an asymmetric actor--critic architecture and use an Incremental Nonlinear Dynamic Inversion (INDI) controller to track the policy output. Combined with high-fidelity simulation of motor response and optical flow, the training framework significantly reduces the sim-to-real gap. Simulation ablation studies validate the importance of the main design choices, while real-world experiments demonstrate zero-shot transfer and robust recovery under different initial attitudes, wind disturbances, and additional payloads. These results show that agile quadrotor fall recovery can be achieved without explicit state estimation, using only limited and unreliable onboard sensing.
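To illustrate the incremental idea behind INDI, the following is a minimal single-axis sketch, not the controller used in this letter. It assumes a scalar plant whose acceleration is a known control effectiveness times the input plus an unknown constant disturbance, with no actuator dynamics or sensor noise. All names (`indi_step`, `plant_accel`, `run_demo`) and all numeric values are hypothetical and chosen only for illustration.

```python
def indi_step(u_prev, accel_cmd, accel_meas, effectiveness):
    """One incremental update: add the input change that should close the
    gap between commanded and measured acceleration.

    The disturbance never appears here; it is captured implicitly through
    the measured acceleration. `effectiveness` is an assumed, known scalar.
    """
    if effectiveness == 0.0:
        raise ValueError("control effectiveness must be nonzero")
    return u_prev + (accel_cmd - accel_meas) / effectiveness


def plant_accel(u, effectiveness, disturbance):
    """Toy plant (assumption): acceleration = effectiveness * u + disturbance."""
    return effectiveness * u + disturbance


def run_demo(steps=3):
    """Apply the incremental law to the toy plant and log the acceleration."""
    g_true = 2.0        # hypothetical control effectiveness
    disturbance = -0.7  # unknown to the controller
    accel_cmd = 1.5     # stand-in for the command a policy would output
    u = 0.0
    history = []
    for _ in range(steps):
        accel_meas = plant_accel(u, g_true, disturbance)
        u = indi_step(u, accel_cmd, accel_meas, g_true)
        history.append(plant_accel(u, g_true, disturbance))
    return history


if __name__ == "__main__":
    print(run_demo())
```

In this idealized setting the acceleration reaches the command after one update even though the controller never models the disturbance. A real INDI loop would additionally have to handle motor lag, filtered measurements, and multi-axis effectiveness matrices.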