HomeRobot: Open-Vocabulary Mobile Manipulation

Sriram Yenamandra, Arun Ramachandran, Karmesh Yadav, Austin Wang, Mukul Khanna, Theophile Gervet, Tsung-Yen Yang, Vidhi Jain, Alexander William Clegg, John Turner, Zsolt Kira, Manolis Savva, Angel Chang, Devendra Singh Chaplot, Dhruv Batra, Roozbeh Mottaghi, Yonatan Bisk, Chris Paxton

From arXiv, 37 pages, 22 figures, 8 tables

HomeRobot (noun): An affordable compliant robot that navigates homes and manipulates a wide range of objects in order to complete everyday tasks. Open-Vocabulary Mobile Manipulation (OVMM) is the problem of picking any object in any unseen environment, and placing it in a commanded location. This is a foundational challenge for robots to be useful assistants in human environments, because it involves tackling sub-problems from across robotics: perception, language understanding, navigation, and manipulation are all essential to OVMM. In addition, integration of the solutions to these sub-problems poses its own substantial challenges. To drive research in this area, we introduce the HomeRobot OVMM benchmark, where an agent navigates household environments to grasp novel objects and place them on target receptacles. HomeRobot has two components: a simulation component, which uses a large and diverse curated object set in new, high-quality multi-room home environments; and a real-world component, providing a software stack for the low-cost Hello Robot Stretch to encourage replication of real-world experiments across labs. We implement both reinforcement learning and heuristic (model-based) baselines and show evidence of sim-to-real transfer. Our baselines achieve a 20% success rate in the real world; our experiments identify ways future research work can improve performance. See videos on our website: https://ovmm.github.io/.
