While imitation learning (IL) has enabled successful visual navigation in many common environments, IL policies are prone to unpredictable failures in out-of-distribution (OOD) scenarios. This calls for failure-resilient policies that not only detect failures, but also recognise their sources and recover from them autonomously. We propose InFeR, a general framework for building IL policies with informed failure resilience, without failure or recovery demonstrations. InFeR retrains an IL policy with a Variational Information Bottleneck (VIB) loss to structure its latent space for OOD failure detection. It then applies Grad-CAM, a visual explainability technique, to localise the image region responsible for a failure, and uses that region to inform a heuristic recovery policy. All of this is achieved without additional training data. Real-world experiments show that InFeR enables informed failure recovery across two different policy architectures, yielding robust long-range navigation in complex environments.
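To make the two ingredients named above more concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes a diagonal-Gaussian latent with a standard-normal prior (the usual VIB setup), uses the per-sample KL term as a hypothetical OOD failure score with a quantile threshold calibrated on in-distribution data, and reduces a precomputed Grad-CAM heatmap to a coarse left/centre/right label that a heuristic recovery policy could consume. All function names, the `beta` weight, the quantile, and the thirds-based localisation are assumptions made for illustration.

```python
import numpy as np

def vib_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(mu**2 + np.exp(log_var) - log_var - 1.0, axis=-1)

def vib_loss(action_error, mu, log_var, beta=1e-3):
    """Imitation error plus a beta-weighted KL regulariser, batch-averaged.

    `beta` is a hypothetical default, not a value from the paper.
    """
    return float(np.mean(action_error + beta * vib_kl(mu, log_var)))

def calibrate_threshold(kl_in_dist, quantile=0.99):
    """Assumed calibration: take a high quantile of in-distribution KL scores."""
    return float(np.quantile(kl_in_dist, quantile))

def is_failure(mu, log_var, threshold):
    """Flag a sample as an OOD failure when its KL score exceeds the threshold."""
    return vib_kl(mu, log_var) > threshold

def failure_side(heatmap):
    """Map a non-negative (H, W) saliency map to the image third with most mass.

    The heatmap is assumed to come from Grad-CAM computed elsewhere.
    """
    thirds = np.array_split(heatmap, 3, axis=1)
    mass = [t.sum() for t in thirds]
    return ("left", "centre", "right")[int(np.argmax(mass))]
```

Under these assumptions, a recovery heuristic might steer away from the side returned by `failure_side` whenever `is_failure` fires. The actual detection score and recovery rule used by InFeR may differ.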