We present an automated and efficient approach for retrieving high-quality CAD models of objects, together with their poses, in a scene captured by a moving RGB-D camera. We first investigate several objective functions that measure the similarity between a candidate CAD model and the available data, and find that the best-performing one is a "render-and-compare" objective that compares depth and mask renderings of the candidate with that data. We then introduce a fast search method that approximates an exhaustive search over this objective and simultaneously retrieves the object category, a CAD model, and the pose of an object, given an approximate 3D bounding box. The method relies on a search tree that organizes the CAD models and object properties, including category and pose, for fast retrieval, and on an algorithm inspired by Monte Carlo Tree Search that searches this tree efficiently. We show that the method retrieves CAD models that fit the real objects very well, with a speed-up factor of 10x to 120x over exhaustive search.
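To make the search idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy "render-and-compare" score (mask IoU minus mean absolute depth error on the overlap) over hand-written four-pixel depth maps and masks that stand in for real renderings, and it uses the standard UCB selection rule as one plausible reading of "inspired by Monte Carlo Tree Search". All names (`render_and_compare`, `Node`, `ucb`, `tree_search`, the category, model, and pose labels) are hypothetical.

```python
import math


def render_and_compare(rendered, observed):
    """Toy similarity between two (depth, mask) pairs given as flat pixel lists.

    Score = mask IoU minus mean absolute depth error over overlapping pixels.
    This is an assumed stand-in, not the objective defined in the paper.
    """
    r_depth, r_mask = rendered
    o_depth, o_mask = observed
    inter = union = 0
    depth_err = 0.0
    for rd, rm, od, om in zip(r_depth, r_mask, o_depth, o_mask):
        if rm and om:
            inter += 1
            depth_err += abs(rd - od)
        if rm or om:
            union += 1
    if union == 0:
        return 0.0
    mean_err = depth_err / inter if inter else 0.0
    return inter / union - mean_err


class Node:
    """Tree level 1: category, level 2: CAD model, level 3 (leaf): pose."""

    def __init__(self, label, children=(), candidate=None):
        self.label = label
        self.children = list(children)
        self.candidate = candidate  # (depth, mask) on leaves, None elsewhere
        self.visits = 0
        self.total = 0.0


def ucb(parent, child, c=1.0):
    if child.visits == 0:
        return math.inf  # unvisited children are always tried first
    mean = child.total / child.visits
    return mean + c * math.sqrt(math.log(parent.visits) / child.visits)


def tree_search(root, observed, iterations):
    """Return (best score, best label path, number of distinct leaves scored)."""
    best_score, best_path = -math.inf, None
    scored = {}  # cache: each leaf is rendered and compared at most once
    for _ in range(iterations):
        node, path = root, [root]
        while node.children:  # selection: descend by UCB until a leaf
            parent = node
            node = max(parent.children, key=lambda ch: ucb(parent, ch))
            path.append(node)
        if id(node) not in scored:
            scored[id(node)] = render_and_compare(node.candidate, observed)
        score = scored[id(node)]
        for n in path:  # backpropagation of the leaf score
            n.visits += 1
            n.total += score
        if score > best_score:
            best_score = score
            best_path = tuple(n.label for n in path[1:])
    return best_score, best_path, len(scored)


def leaf(label, depth, mask):
    return Node(label, candidate=(depth, mask))


# Hypothetical observation: three foreground pixels at depth 1.0.
observed = ([1.0, 1.0, 1.0, 0.0], [1, 1, 1, 0])

root = Node("root", [
    Node("chair", [
        Node("chair_a", [leaf("0deg", [1.0, 1.0, 0.0, 0.0], [1, 1, 0, 0]),
                         leaf("90deg", [0.0, 0.0, 0.0, 1.0], [0, 0, 0, 1])]),
        Node("chair_b", [leaf("0deg", [1.0, 1.0, 1.0, 0.0], [1, 1, 1, 0]),
                         leaf("90deg", [1.2, 1.2, 1.2, 0.0], [1, 1, 1, 0])]),
    ]),
    Node("table", [
        Node("table_a", [leaf("0deg", [1.0, 1.0, 1.0, 1.0], [1, 1, 1, 1]),
                         leaf("90deg", [2.0, 2.0, 2.0, 0.0], [1, 1, 1, 0])]),
    ]),
])

best_score, best_path, n_scored = tree_search(root, observed, iterations=20)
print(best_path, best_score, n_scored)
```

In this toy tree the perfect match is the leaf `chair_b` at `0deg`, and the search reaches it on its fourth descent. The cache means a leaf is scored only once however often it is revisited, which is where a tree search of this kind can save work relative to scoring every (category, model, pose) combination.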