We present NeuralLabeling, a labeling approach and toolset for annotating a scene using either bounding boxes or meshes and generating segmentation masks, affordance maps, 2D bounding boxes, 3D bounding boxes, 6DOF object poses, depth maps, and object meshes. NeuralLabeling uses Neural Radiance Fields (NeRF) as a renderer, so labeling can be performed with 3D spatial tools that incorporate geometric cues such as occlusions, while relying only on images captured from multiple viewpoints as input. To demonstrate the applicability of NeuralLabeling to a practical problem in robotics, we added ground-truth depth maps to 30,000 frames of RGB images and noisy depth maps of transparent glasses placed in a dishwasher, captured using an RGBD sensor, yielding the Dishwasher30k dataset. We show that training a simple deep neural network with supervision from the annotated depth maps yields higher reconstruction performance than training with the previously applied weakly supervised approach.