Designing costmaps for off-road driving is often a challenging, engineering-intensive task. Recent work on off-road costmap design focuses on training deep neural networks to predict costmaps from sensory observations using corpora of expert driving data. However, such approaches are generally prone to over-confident mispredictions and are rarely evaluated in the loop on physical hardware. We present an inverse reinforcement learning-based method for efficiently training uncertainty-aware deep cost functions, leveraging recent advances in highly parallel model-predictive control and robotic risk estimation. In addition to demonstrating improved reproduction of expert trajectories, we evaluate the method in challenging off-road navigation scenarios. Our method significantly outperforms a geometric baseline, yielding a 44% improvement in expert path reconstruction and 57% fewer interventions in practice. We also observe that varying the vehicle's risk tolerance produces qualitatively different navigation behaviors, especially in higher-risk scenarios such as slopes and tall grass.