Depth cameras are frequently used in robotic manipulation, e.g. for visual servoing. However, the quality of small, compact depth cameras is often insufficient for depth reconstruction, which is required for precise tracking in, and perception of, the robot's working space. Building on the work of Shabanov et al. (2021), we present a self-supervised multi-object depth denoising pipeline that uses depth maps from higher-quality sensors as close-to-ground-truth supervisory signals to denoise depth maps from a lower-quality sensor. We show a computationally efficient way to align sets of two frame pairs in space and to retrieve a frame-based multi-object mask, yielding a clean labeled dataset on which to train a denoising neural network. The implementation of the presented work can be found at https://github.com/alr-internship/self-supervised-depth-denoising.
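To illustrate how an aligned higher-quality depth map and a multi-object mask could act together as a supervisory signal, the following is a minimal sketch of a masked depth loss. It is not taken from the linked repository: the function name `masked_l1_loss`, the choice of an L1 error, and the convention that a depth of zero marks a missing measurement are all assumptions made for illustration.

```python
import numpy as np


def masked_l1_loss(pred, target, mask):
    """Mean absolute depth error over masked pixels with a valid target depth.

    pred   -- denoised depth map from the lower-quality sensor (H x W)
    target -- aligned depth map from the higher-quality sensor (H x W)
    mask   -- boolean multi-object mask (H x W), True on object pixels

    Assumption: a target depth <= 0 or a non-finite value marks a missing
    measurement and is excluded from the loss.
    """
    valid = mask & np.isfinite(target) & (target > 0)
    n_valid = valid.sum()
    if n_valid == 0:
        return 0.0  # nothing to supervise in this frame
    return float(np.abs(pred - target)[valid].sum() / n_valid)


# Toy example: a 4 x 4 frame with a 2 x 2 object region.
target = np.ones((4, 4))
pred = target + 0.1           # prediction is off by 0.1 everywhere
target[1, 1] = 0.0            # one missing measurement inside the object

mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True

loss = masked_l1_loss(pred, target, mask)
```

Restricting the error to masked, valid pixels means that background clutter and sensor dropouts in the higher-quality frame do not contribute to the training signal, which is the role the abstract assigns to the frame-based multi-object mask.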