Home-assistant robots are a long-standing research topic, and one of their biggest challenges is searching for required objects in household environments. Conventional object-goal navigation requires the robot to search for a target object category in an unexplored environment, a setting that is poorly suited to home-assistant robots, which typically have some semantic knowledge of their environment, such as the locations of static furniture. Our approach leverages this knowledge, together with the observation that a target object is likely to be located close to related objects, to navigate efficiently. To achieve this, we train a graph neural network on the Visual Genome dataset to learn object co-occurrence relationships, and we formulate the search process as iteratively predicting the areas where the target object may be located. The approach is entirely zero-shot: it does not require accurate object correlation information specific to the test environment. We empirically show that our method outperforms prior correlational object search algorithms. As our ultimate goal is to build fully autonomous assistant robots for everyday use, we further integrate object navigation with a task planner that parses natural language and generates task-completing plans, so that the robot can execute human instructions. We demonstrate the effectiveness of the proposed pipeline both in the AI2-THOR simulator and on a Stretch robot in a real-world environment.
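The iterative search formulation can be illustrated with a minimal sketch. This is not the authors' implementation: the trained graph neural network is replaced here by a small hand-written co-occurrence table with made-up scores, and the names `CO_OCCURRENCE`, `rank_areas`, `search`, and the `detect` callback are all hypothetical. The sketch only shows the general idea of repeatedly choosing the known furniture location most related to the target, visiting it, and re-ranking the remaining areas.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical co-occurrence scores between a target object and known
# static furniture. Illustrative numbers only; in the paper these
# relationships are learned by a GNN from Visual Genome.
CO_OCCURRENCE: Dict[str, Dict[str, float]] = {
    "mug": {"sink": 0.8, "coffee_table": 0.6, "bed": 0.1},
    "pillow": {"bed": 0.9, "sofa": 0.7, "sink": 0.0},
}


def rank_areas(target: str, furniture: List[str], visited: Set[str]) -> List[str]:
    """Rank unvisited furniture by co-occurrence with the target, best first."""
    scores = CO_OCCURRENCE.get(target, {})
    candidates = [f for f in furniture if f not in visited]
    return sorted(candidates, key=lambda f: scores.get(f, 0.0), reverse=True)


def search(
    target: str,
    furniture: List[str],
    detect: Callable[[str, str], bool],
    max_steps: int = 10,
) -> Tuple[Optional[str], List[str]]:
    """Visit the most promising area, check for the target, and repeat.

    `detect(area, target)` is an assumed stand-in for navigating to an
    area and running an object detector there.
    Returns the area where the target was found (or None) and the visit order.
    """
    visited: Set[str] = set()
    order: List[str] = []
    for _ in range(max_steps):
        ranked = rank_areas(target, furniture, visited)
        if not ranked:
            break  # every known area has been checked
        area = ranked[0]
        visited.add(area)
        order.append(area)
        if detect(area, target):
            return area, order
    return None, order


if __name__ == "__main__":
    known = ["bed", "coffee_table", "sink"]
    # Pretend the mug is actually on the coffee table.
    found, order = search("mug", known, lambda area, obj: area == "coffee_table")
    print(found, order)
```

In this toy version the ranking changes between iterations only because visited areas are excluded. A fuller system would presumably also update its predictions with objects observed along the way, which this sketch does not attempt.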