Modern autonomous navigation systems predominantly rely on lidar and depth cameras. However, a fundamental question remains: can flying robots navigate in clutter using only monocular RGB images? Given the prohibitive cost of real-world data collection, learning policies in simulation offers a promising path. Yet deploying such policies directly in the physical world is hindered by the significant sim-to-real perception gap. To bridge this gap, we propose a framework that couples the photorealism of 3D Gaussian Splatting (3DGS) environments with adversarial domain adaptation. By training in high-fidelity simulation while explicitly minimizing the feature discrepancy between simulated and real observations, our method encourages the policy to rely on domain-invariant cues. Experimental results demonstrate that our policy achieves robust zero-shot transfer to the physical world, enabling safe and agile flight in unstructured environments under varying illumination.
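The adversarial domain adaptation described above can be sketched as a min-max objective in the style of standard domain-adversarial training; the specific loss terms, symbols, and weighting $\lambda$ below are illustrative assumptions, not details taken from this abstract:

```latex
\min_{F,\,\pi}\;\max_{D}\;
\mathcal{L}_{\mathrm{task}}\bigl(\pi(F(x_{\mathrm{sim}}))\bigr)
\;-\;\lambda\,
\mathcal{L}_{\mathrm{dom}}\bigl(D(F(x)),\,d\bigr)
```

Here $F$ denotes a shared visual feature extractor, $\pi$ the navigation policy head trained on simulated images $x_{\mathrm{sim}}$, and $D$ a domain discriminator that tries to predict the domain label $d \in \{\mathrm{sim}, \mathrm{real}\}$ from features $F(x)$. Training $F$ to fool $D$ drives the features toward domain invariance, which is what allows the policy to transfer zero-shot from 3DGS renderings to real camera images.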