Off-road autonomous navigation demands reliable 3D perception for robust obstacle detection in challenging unstructured terrain. LiDAR is accurate, but it is costly and power-intensive. Monocular depth estimation using foundation models offers a lightweight alternative, but its integration into outdoor navigation stacks remains underexplored. We present an open-source off-road navigation stack that supports both LiDAR and monocular 3D perception without task-specific training. For the monocular setup, we combine zero-shot depth prediction (Depth Anything V2) with metric depth rescaling using sparse SLAM measurements (VINS-Mono). Two key enhancements improve robustness: edge masking to reduce obstacle hallucination and temporal smoothing to mitigate the impact of SLAM instability. The resulting point cloud is used to generate a robot-centric 2.5D elevation map for costmap-based planning. Evaluated in photorealistic simulations (Isaac Sim) and real-world unstructured environments, the monocular configuration matches high-resolution LiDAR performance in most scenarios, demonstrating that foundation-model-based monocular depth estimation is a viable LiDAR alternative for robust off-road navigation. By open-sourcing the navigation stack and the simulation environment, we provide a complete pipeline for off-road navigation as well as a reproducible benchmark. Code is available at https://github.com/LARIAD/Offroad-Nav.
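To make the metric rescaling and temporal smoothing steps concrete, the following is a minimal sketch and not the implementation in the released stack. It assumes that the relative depth and the sparse SLAM depth are related by a per-frame scale and shift, fitted by least squares at the pixels where SLAM features are available, and that smoothing is an exponential moving average over those two parameters. The names `fit_scale_shift`, `SmoothedRescaler`, and the factor `alpha` are hypothetical. If the network output is inverse-depth-like, the same fit would be performed against the inverse of the metric depth instead.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


def fit_scale_shift(rel: Sequence[float], metric: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of metric ~= scale * rel + shift over sparse samples.

    rel:    relative depth sampled at the sparse feature pixels (assumed input)
    metric: metric depth of the same features from SLAM (assumed input)
    """
    n = len(rel)
    if n != len(metric) or n < 2:
        raise ValueError("need at least two matched samples")
    mean_r = sum(rel) / n
    mean_m = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in rel)
    if var_r == 0.0:
        raise ValueError("relative depths are constant; the fit is degenerate")
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(rel, metric))
    scale = cov / var_r
    shift = mean_m - scale * mean_r
    return scale, shift


@dataclass
class SmoothedRescaler:
    """Keeps an exponential moving average of the per-frame scale and shift."""

    alpha: float = 0.2  # hypothetical smoothing factor in (0, 1]
    scale: Optional[float] = None
    shift: Optional[float] = None

    def update(self, rel: Sequence[float], metric: Sequence[float]) -> Tuple[float, float]:
        s, t = fit_scale_shift(rel, metric)
        if self.scale is None or self.shift is None:
            # First frame: nothing to smooth against yet.
            self.scale, self.shift = s, t
        else:
            # Blend the new estimate with the running one to damp SLAM jitter.
            self.scale = (1.0 - self.alpha) * self.scale + self.alpha * s
            self.shift = (1.0 - self.alpha) * self.shift + self.alpha * t
        return self.scale, self.shift

    def apply(self, rel_depth: Sequence[float]) -> List[float]:
        """Convert relative depth values to metric depth with the smoothed fit."""
        if self.scale is None or self.shift is None:
            raise RuntimeError("call update() at least once before apply()")
        return [self.scale * d + self.shift for d in rel_depth]
```

In this sketch, a smaller `alpha` trades responsiveness for stability: a single frame with poorly triangulated SLAM features shifts the metric scale only slightly, at the cost of slower adaptation when the scene changes.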