Most previous work on outdoor instance segmentation for images uses only color information. We explore a novel direction of sensor fusion that exploits stereo cameras. Geometric information from disparities helps separate overlapping objects of the same or different classes. Moreover, geometric information penalizes region proposals with unlikely 3D shapes, thus suppressing false positive detections. Mask regression is based on 2D, 2.5D, and 3D ROIs using pseudo-lidar and image-based representations. These mask predictions are fused by a mask scoring process. However, public datasets only adopt stereo systems with shorter baselines and focal lengths, which limit the measuring range of stereo cameras. We collect and utilize the High-Quality Driving Stereo (HQDS) dataset, which uses a much longer baseline and focal length with higher resolution. Our method achieves state-of-the-art performance. Please refer to our project page. The full paper is available here.
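To illustrate why baseline and focal length bound the measuring range, the following is a minimal sketch of the standard rectified-stereo relations: depth Z = f·B/d, the first-order depth uncertainty ΔZ ≈ Z²/(f·B)·Δd, and the pinhole back-projection commonly used to build pseudo-lidar points. It is not the authors' implementation; the function names and all numeric rig parameters are hypothetical and are not taken from HQDS or any public dataset.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth of a point in a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_error(depth_m, focal_px, baseline_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~= Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px


def backproject(u, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Back-project one pixel to a 3D point in the camera frame
    (the per-pixel step behind a pseudo-lidar point cloud)."""
    z = disparity_to_depth(disparity_px, focal_px, baseline_m)
    x = (u - cx) * z / focal_px
    y = (v - cy) * z / focal_px
    return (x, y, z)


if __name__ == "__main__":
    # Hypothetical rigs, chosen only to show the trend.
    short_rig = {"focal_px": 700.0, "baseline_m": 0.5}
    long_rig = {"focal_px": 2000.0, "baseline_m": 1.0}
    for name, rig in (("short", short_rig), ("long", long_rig)):
        err = depth_error(50.0, **rig)
        print(f"{name} rig: depth error at 50 m for 1 px disparity error = {err:.2f} m")
```

Because the error grows with Z² and shrinks with f·B, a rig with a longer baseline and focal length keeps the same disparity error from translating into a large depth error at long range, which is the motivation the abstract gives for collecting a new dataset.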