Implicit Shape Model Trees: Recognition of 3-D Indoor Scenes and Prediction of Object Poses for Mobile Robots

We present an approach that lets a mobile robot recognize scenes in arrangements of objects distributed over cluttered environments. Recognition is made possible by letting the robot alternate between searching for objects and assigning the objects it finds to scenes. Our scene model, the Implicit Shape Model (ISM) tree, allows us to solve these two tasks jointly. For ISM trees, this article presents novel algorithms for recognizing scenes and for predicting the poses of searched objects. We define scenes as sets of objects in which some objects are connected by 3-D spatial relations. In previous work, we recognized scenes using single ISMs, but these were prone to false positives. To address this problem, we introduced ISM trees, a hierarchical model that comprises multiple ISMs. The recognition algorithm contributed in this article is what makes ISM trees usable for scene recognition. We also intend to let users generate ISM trees from object arrangements demonstrated by humans; since no suitable algorithm existed for this, we introduce an ISM tree generation algorithm. Scene recognition usually assumes that image data is already available, which is not always the case for robots. For this reason, we combined scene recognition and object search in previous work, but we did not provide an efficient algorithm to link the two tasks. This article introduces such an algorithm, which uses spatial relations to predict the poses of searched objects. Experiments show that our overall approach enables robots to find and recognize object arrangements that cannot be perceived from a single viewpoint.
