High-quality estimation of surface normals can help reduce ambiguity in many geometry understanding problems, such as collision avoidance and occlusion inference. This paper presents a technique for estimating surface normals from 3D point clouds and 2D colour images. We develop a transformer neural network that learns to exploit hybrid visual-semantic and 3D geometric information, together with effective learning strategies. Experiments show that the proposed method fuses this information more effectively than existing methods. We also build a simulation environment of outdoor traffic scenes in a 3D rendering engine to obtain annotated data for training the normal estimator. The model trained on synthetic data is tested on real scenes from the KITTI dataset. Downstream tasks built upon the normal directions estimated on KITTI show that the proposed estimator has an advantage over existing methods.
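To make the task concrete, the following is a minimal sketch of a classical geometric baseline for normal estimation from a point cloud: fitting a plane to a local neighbourhood by PCA. It is not the transformer-based estimator proposed here, and it uses no image information. The function names `estimate_normal_pca` and `angular_error_deg` and the `viewpoint` parameter are hypothetical names chosen for illustration.

```python
import numpy as np


def estimate_normal_pca(neighbours, viewpoint=None):
    """Estimate a unit normal for a local patch of 3D points via PCA.

    Classical baseline for illustration only (an assumption, not the
    method described in the abstract).
    """
    pts = np.asarray(neighbours, dtype=float)
    centroid = pts.mean(axis=0)
    centred = pts - centroid
    # Scatter matrix of the patch; the direction of least variance
    # approximates the surface normal.
    scatter = centred.T @ centred
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(scatter)
    normal = eigvecs[:, 0]
    if viewpoint is not None:
        # The sign from PCA is arbitrary; orient the normal towards the sensor.
        to_sensor = np.asarray(viewpoint, dtype=float) - centroid
        if np.dot(normal, to_sensor) < 0:
            normal = -normal
    return normal


def angular_error_deg(n_est, n_ref):
    """Angle in degrees between an estimated and a reference normal."""
    n_est = np.asarray(n_est, dtype=float)
    n_ref = np.asarray(n_ref, dtype=float)
    cos = np.dot(n_est, n_ref) / (np.linalg.norm(n_est) * np.linalg.norm(n_ref))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))


# Usage: a flat patch in the z = 0 plane, observed from above.
patch = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0.5, 0.2, 0]])
normal = estimate_normal_pca(patch, viewpoint=[0.0, 0.0, 5.0])
```

A purely geometric estimate of this kind depends on the local point density and the neighbourhood size, which is one reason to consider combining geometric data with image information as the abstract describes.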