In this paper, we study the problem of continuous 3D shape representations. The majority of existing successful methods are coordinate-based implicit neural representations. However, they are inefficient at rendering novel views or recovering explicit surface points. A few works have begun to formulate 3D shapes as ray-based neural functions, but the learned structures are inferior due to the lack of multi-view geometry consistency. To tackle these challenges, we propose a new framework called RayDF. It consists of three major components: 1) a simple ray-surface distance field, 2) a novel dual-ray visibility classifier, and 3) a multi-view consistency optimization module that drives the learned ray-surface distances to be multi-view geometry consistent. We extensively evaluate our method on three public datasets, demonstrating remarkable performance in 3D surface point reconstruction on both synthetic and challenging real-world 3D scenes, clearly surpassing existing coordinate-based and ray-based baselines. Most notably, our method renders an 800x800 depth image 1000x faster than coordinate-based methods, showing the superiority of our method for 3D shape representation. Our code and data are available at https://github.com/vLAR-group/RayDF
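
To make the notion of a ray-surface distance field and of multi-view geometry consistency concrete, the following is a minimal sketch and not the RayDF implementation. It assumes an analytic unit sphere in place of the learned network, and the function names (`ray_sphere_distance`, `surface_point`, `consistent_distance`) are hypothetical. A ray is mapped directly to the distance of its first surface hit, so the surface point is recovered with a single evaluation; a second ray aimed at the same point from another origin should then report a distance equal to the Euclidean distance between that origin and the point, provided the point is visible from both origins.

```python
import numpy as np

def ray_sphere_distance(origin, direction, center=(0.0, 0.0, 0.0), radius=1.0):
    """Analytic stand-in (assumption) for a learned ray-surface distance field.

    Returns the distance along a unit-length ray to its first hit on a sphere,
    or np.inf if the ray misses. Assumes the origin lies outside the sphere.
    """
    oc = np.asarray(origin, dtype=float) - np.asarray(center, dtype=float)
    b = np.dot(oc, direction)
    c = np.dot(oc, oc) - radius ** 2
    disc = b * b - c
    if disc < 0:
        return np.inf
    t = -b - np.sqrt(disc)
    return t if t >= 0 else np.inf

def surface_point(origin, direction, distance):
    """Recover the explicit surface point from a ray and its distance."""
    return np.asarray(origin, dtype=float) + distance * np.asarray(direction, dtype=float)

def consistent_distance(point, other_origin):
    """Distance that a mutually visible second ray toward `point` should report."""
    return np.linalg.norm(np.asarray(point, dtype=float) - np.asarray(other_origin, dtype=float))

# First view: a ray looking down the +z axis toward the sphere.
o1 = np.array([0.0, 0.0, -3.0])
d1 = np.array([0.0, 0.0, 1.0])
t1 = ray_sphere_distance(o1, d1)
p = surface_point(o1, d1, t1)

# Second view: a different origin, with a ray aimed at the same surface point.
o2 = np.array([2.0, 0.0, -3.0])
d2 = (p - o2) / np.linalg.norm(p - o2)
t2 = ray_sphere_distance(o2, d2)
expected = consistent_distance(p, o2)

# The two values agree because p is the first hit along both rays.
print(t1, p, t2, expected)
```

In this toy setting the agreement between `t2` and `expected` holds by construction; for a learned distance field it would have to be encouraged during training, and only for ray pairs that actually see the same surface point.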