Reconstructing transparent objects with affordable RGB-D cameras is a persistent challenge in robotic perception, because their appearance in the RGB domain is inconsistent across views and the depth readings from any single view are inaccurate. We introduce a two-stage pipeline for reconstructing transparent objects, tailored for mobile platforms. In the first stage, off-the-shelf monocular object segmentation and depth completion networks predict the depth of transparent objects, furnishing single-view shape priors. In the second stage, we propose Epipolar-guided Optical Flow (EOF) to fuse several single-view shape priors from the first stage into a cross-view consistent 3D reconstruction, given camera poses estimated from the opaque parts of the scene. Our key innovation lies in EOF, which incorporates boundary-sensitive sampling and epipolar-line constraints into optical flow to accurately establish 2D correspondences across multiple views of transparent objects. Quantitative evaluations demonstrate that our pipeline significantly outperforms baseline methods in 3D reconstruction quality, paving the way for more adept robotic perception of and interaction with transparent objects.
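As a rough illustration of what an epipolar-line constraint on optical flow can look like, the sketch below snaps a flow-predicted match onto the epipolar line of its source pixel. It is not the EOF method itself, whose details are not given here: the helper names (`epipolar_line`, `project_onto_line`, `constrain_flow`), the orthogonal-projection strategy, and the toy fundamental matrix are all assumptions made for the example, and boundary-sensitive sampling is not covered.

```python
def epipolar_line(F, p):
    """Return coefficients (a, b, c) of the line l = F @ [x, y, 1] in the second view."""
    x, y = p
    return tuple(row[0] * x + row[1] * y + row[2] for row in F)


def project_onto_line(q, line):
    """Orthogonally project the 2D point q onto the line a*x + b*y + c = 0."""
    a, b, c = line
    norm_sq = a * a + b * b
    if norm_sq == 0.0:
        raise ValueError("degenerate epipolar line")
    d = (a * q[0] + b * q[1] + c) / norm_sq
    return (q[0] - a * d, q[1] - b * d)


def constrain_flow(F, p, flow):
    """Snap the flow-predicted match p + flow onto the epipolar line of p."""
    q = (p[0] + flow[0], p[1] + flow[1])
    return project_onto_line(q, epipolar_line(F, p))


# Assumed toy fundamental matrix for a pure horizontal camera translation:
# epipolar lines are horizontal, so a valid match must stay on the same row.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]

match = constrain_flow(F, p=(10.0, 20.0), flow=(5.0, 3.0))
print(match)  # (15.0, 20.0): the 3-pixel vertical drift is removed
```

In this toy case the raw flow would place the match at (15, 23); projecting onto the epipolar line keeps the horizontal displacement and discards the component that is geometrically impossible given the camera poses.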