To successfully perform manipulation tasks such as grasping in new environments, robots must be proficient in segmenting unseen objects from the background and from other objects. Previous works perform unseen object instance segmentation (UOIS) by training deep neural networks on large-scale data to learn RGB/RGB-D feature embeddings, but cluttered environments often result in inaccurate segmentations. We build upon these methods and introduce a novel approach that corrects inaccurate segmentation, such as under-segmentation, in static image-based UOIS masks by using robot interaction and a designed body-frame-invariant feature. We demonstrate that the relative linear and rotational velocities of frames randomly attached to rigid bodies, induced by robot interactions, can be used to identify objects and accumulate corrected object-level segmentation masks. By introducing motion to regions of segmentation uncertainty, we drastically improve segmentation accuracy in an uncertainty-driven manner with minimal, non-disruptive interactions (ca. 2-3 per scene). We demonstrate the effectiveness of our proposed interactive perception pipeline in accurately segmenting cluttered scenes by achieving an average object segmentation accuracy rate of 80.7%, an increase of 28.2% when compared with other state-of-the-art UOIS methods.
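The idea that relative frame motion separates rigid bodies can be illustrated with a minimal sketch. It is not the feature or pipeline proposed here: it only assumes that two frames attached to the same rigid body keep a constant relative pose, so their relative linear and rotational velocities vanish, while frames on different bodies generally do not. All names (`make_pose`, `relative_velocity`, `same_rigid_body`), the time step `dt`, and the tolerance `tol` are hypothetical choices made for illustration.

```python
import numpy as np

def rot_z(theta):
    """Rotation matrix for an angle theta (rad) about the z-axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def make_pose(R, p):
    """Build a 4x4 homogeneous transform from rotation R and position p."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = p
    return T

def relative_velocity(T_a0, T_b0, T_a1, T_b1, dt=1.0):
    """Finite-difference relative speed of frame b as seen from frame a.

    Returns (linear speed, angular speed). Both are computed from the
    change in the pose of b expressed in a, so they do not depend on
    where the world frame is placed.
    """
    rel0 = np.linalg.inv(T_a0) @ T_b0   # pose of b in a before the interaction
    rel1 = np.linalg.inv(T_a1) @ T_b1   # pose of b in a after the interaction
    delta = np.linalg.inv(rel0) @ rel1  # identity if a and b moved rigidly together
    lin = np.linalg.norm(delta[:3, 3]) / dt
    cos_angle = np.clip((np.trace(delta[:3, :3]) - 1.0) / 2.0, -1.0, 1.0)
    ang = np.arccos(cos_angle) / dt
    return lin, ang

def same_rigid_body(T_a0, T_b0, T_a1, T_b1, tol=1e-5):
    """Assumed decision rule: near-zero relative motion means one rigid body."""
    lin, ang = relative_velocity(T_a0, T_b0, T_a1, T_b1)
    return bool(lin < tol and ang < tol)

# Frames a and b sit on a pushed object; frame c sits on a static object.
push = make_pose(rot_z(0.3), np.array([0.05, 0.02, 0.0]))  # motion from one push
T_a0 = make_pose(rot_z(0.1), np.array([0.40, 0.00, 0.1]))
T_b0 = make_pose(rot_z(1.2), np.array([0.45, 0.03, 0.1]))
T_c0 = make_pose(rot_z(-0.5), np.array([0.60, 0.10, 0.1]))
T_a1, T_b1, T_c1 = push @ T_a0, push @ T_b0, T_c0

print(same_rigid_body(T_a0, T_b0, T_a1, T_b1))  # a and b moved together
print(same_rigid_body(T_a0, T_c0, T_a1, T_c1))  # c stayed behind
```

In this toy setup a single push is enough to tell the two objects apart. With real tracked frames the poses are noisy, so the tolerance would have to be tuned, and the thresholding rule shown here is only one possible way to turn relative velocities into object groupings.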