Training learning-based policies for quadrotors and transferring them from simulation to reality remains challenging due to inefficient visual rendering, physical modeling inaccuracies, unmodeled sensor discrepancies, and the absence of a unified platform that integrates differentiable physics learning into end-to-end training. While recent work has demonstrated various end-to-end quadrotor control tasks, few systems provide a systematic, zero-shot transfer pipeline, which hinders reproducibility and real-world deployment. To bridge this gap, we introduce E2E-Fly, an integrated framework featuring an agile quadrotor platform coupled with a full-stack training, validation, and deployment workflow. The training framework incorporates a high-performance simulator that supports both differentiable physics learning and reinforcement learning, alongside structured reward design tailored to common quadrotor tasks. We further introduce a two-stage validation strategy based on sim-to-sim transfer and hardware-in-the-loop testing. Policies are deployed onto two physical quadrotor platforms via a dedicated low-level control interface and a comprehensive sim-to-real alignment methodology that encompasses system identification, domain randomization, latency compensation, and noise modeling. To the best of our knowledge, this is the first work to systematically unify differentiable physics learning with training, validation, and real-world deployment for quadrotors. Finally, we demonstrate the effectiveness of our framework by training policies for six end-to-end control tasks and deploying them in the real world.