We present Seq-DeepIPC, a sequential end-to-end perception-to-control model for legged robot navigation in real-world environments. Seq-DeepIPC advances intelligent sensing for autonomous legged navigation by tightly integrating multi-modal perception (RGB-D + GNSS) with temporal fusion and control. The model jointly predicts semantic segmentation and depth estimation, providing richer spatial features for planning and control. For efficient deployment on edge devices, we use EfficientNet-B0 as the encoder, reducing computation while maintaining accuracy. Heading estimation is simplified by removing the noisy IMU and instead computing the bearing angle directly from consecutive GNSS positions. We collected a larger and more diverse dataset that covers both road and grass terrains, and validated Seq-DeepIPC on a robot dog. Comparative and ablation studies show that sequential inputs improve perception and control in our models, whereas the baseline models do not benefit from them. Seq-DeepIPC achieves competitive or better results with a reasonable model size; although GNSS-only heading estimation is less reliable near tall buildings, it is robust in open areas. Overall, Seq-DeepIPC extends end-to-end navigation beyond wheeled robots to more versatile and temporally aware systems. To support future research, we will release the code in our GitHub repository at https://github.com/oskarnatan/Seq-DeepIPC.
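As a minimal sketch of the GNSS-only heading estimation described above: the bearing between two consecutive GNSS fixes can be computed with the standard great-circle forward-azimuth formula. The function name and conventions below are illustrative assumptions, not the paper's actual implementation.

```python
import math

def gnss_bearing(lat1, lon1, lat2, lon2):
    """Forward azimuth (bearing) from fix 1 to fix 2, in degrees in [0, 360).

    Inputs are WGS-84 latitude/longitude in degrees from two consecutive
    GNSS positions; this is a hypothetical helper, not the paper's code.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    # Standard great-circle forward-azimuth formula.
    y = math.sin(dlam) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlam))
    return math.degrees(math.atan2(y, x)) % 360.0
```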