Wait, That Feels Familiar: Learning to Extrapolate Human Preferences for Preference-Aligned Path Planning

Autonomous mobility tasks such as last-mile delivery require reasoning about operator-indicated preferences over the terrains on which the robot should navigate, to ensure both robot safety and mission success. However, coping with out-of-distribution data from novel terrains or appearance changes due to lighting variations remains a fundamental problem in visual terrain-adaptive navigation. Existing solutions either require labor-intensive manual data recollection and labeling or use hand-coded reward functions that may not align with operator preferences. In this work, we posit that operator preferences for visually novel terrains, which the robot should adhere to, can often be extrapolated from established terrain references within the inertial, proprioceptive, and tactile domains. Leveraging this insight, we introduce Preference extrApolation for Terrain awarE Robot Navigation (PATERN), a novel framework for extrapolating operator terrain preferences for visual navigation. PATERN learns to map inertial, proprioceptive, and tactile measurements from the robot's observations to a representation space and performs nearest-neighbor search in this space to estimate operator preferences over novel terrains. Through physical robot experiments in outdoor environments, we assess PATERN's capability to extrapolate preferences and generalize to novel terrains and challenging lighting conditions. Compared to baseline approaches, our findings indicate that PATERN robustly generalizes to diverse terrains and varied lighting conditions while navigating in a preference-aligned manner.
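
The nearest-neighbor step described above can be illustrated with a minimal sketch. It is not the PATERN implementation: the function name `extrapolate_preference`, the use of Euclidean distance, the optional `max_distance` cutoff, and the integer preference ranks are all assumptions made for illustration, and the embeddings are taken as given rather than produced by a learned encoder.

```python
import math
from typing import Optional, Sequence, Tuple


def extrapolate_preference(
    novel_embedding: Sequence[float],
    known_embeddings: Sequence[Sequence[float]],
    known_preferences: Sequence[int],
    max_distance: Optional[float] = None,
) -> Tuple[Optional[int], float]:
    """Hypothetical helper: borrow the preference of the nearest known terrain.

    Each embedding stands in for a point in a representation space built from
    inertial, proprioceptive, and tactile measurements. Returns a pair
    (preference, distance); preference is None when the nearest known terrain
    is farther away than max_distance (an assumed cutoff, not from the paper).
    """
    if not known_embeddings:
        raise ValueError("at least one known terrain is required")
    if len(known_embeddings) != len(known_preferences):
        raise ValueError("each known embedding needs exactly one preference")

    # Brute-force nearest-neighbor search using Euclidean distance.
    best_index, best_distance = min(
        ((i, math.dist(novel_embedding, e)) for i, e in enumerate(known_embeddings)),
        key=lambda pair: pair[1],
    )

    # If nothing "feels familiar" enough, decline to extrapolate.
    if max_distance is not None and best_distance > max_distance:
        return None, best_distance
    return known_preferences[best_index], best_distance
```

In this sketch, a visually novel terrain whose non-visual embedding lies close to that of a known terrain inherits that terrain's preference, while a terrain that is far from every reference is left unlabeled so that the operator can be consulted.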
