Searching for objects is a fundamental skill for robots. As such, we expect object search to eventually become an off-the-shelf capability for robots, much like object detection and SLAM. Yet no system for 3D object search exists that generalizes across real robots and environments. In this paper, building upon a recent theoretical framework that exploited the octree structure for representing belief in 3D, we present GenMOS (Generalized Multi-Object Search), the first general-purpose system for multi-object search (MOS) in a 3D region that is robot-independent and environment-agnostic. GenMOS takes as input point cloud observations of the local region, object detection results, and localization of the robot's view pose, and outputs a 6D viewpoint to move to through online planning. In particular, GenMOS uses point cloud observations in three ways: (1) to simulate occlusion; (2) to inform occupancy and initialize octree belief; and (3) to sample a belief-dependent graph of view positions that avoid obstacles. We evaluate our system both in simulation and on two real robot platforms. Our system enables, for example, a Boston Dynamics Spot robot to find a toy cat hidden underneath a couch in under one minute. We further integrate 3D local search with 2D global search to handle larger areas, demonstrating the resulting system in a 25 m$^2$ lobby area.