In this paper, we study the problem of continuous 3D shape representations. Most existing successful methods are coordinate-based implicit neural representations. However, they are inefficient at rendering novel views or recovering explicit surface points. A few works have begun to formulate 3D shapes as ray-based neural functions, but the learned structures are inferior because they lack multi-view geometry consistency. To tackle these challenges, we propose a new framework called RayDF. It consists of three major components: 1) a simple ray-surface distance field, 2) a novel dual-ray visibility classifier, and 3) a multi-view consistency optimization module that drives the learned ray-surface distances to be multi-view geometry consistent. We extensively evaluate our method on three public datasets, demonstrating remarkable performance in 3D surface point reconstruction on both synthetic and challenging real-world 3D scenes, clearly surpassing existing coordinate-based and ray-based baselines. Most notably, our method renders an 800x800 depth image 1000x faster than coordinate-based methods, showing its superiority for 3D shape representation. Our code and data are available at https://github.com/vLAR-group/RayDF
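To make the idea of multi-view geometry consistency for a ray-surface distance field concrete, the following is a minimal sketch and not the RayDF implementation. It assumes an analytic sphere as a stand-in for the learned distance network, and it assumes the surface point is visible from both ray origins (the role the dual-ray visibility classifier plays in the framework). All function names here (`sphere_ray_distance`, `surface_point`, `consistency_residual`) are hypothetical and chosen for illustration only.

```python
import numpy as np


def sphere_ray_distance(origin, direction, center, radius):
    """Analytic ray-surface distance to a sphere (stand-in for a learned field).

    Returns np.inf when the ray misses the sphere. Assumes the origin lies
    outside the sphere.
    """
    direction = direction / np.linalg.norm(direction)
    oc = origin - center
    b = np.dot(oc, direction)
    c = np.dot(oc, oc) - radius ** 2
    disc = b * b - c
    if disc < 0:
        return np.inf
    t = -b - np.sqrt(disc)  # nearest intersection along the ray
    return t if t >= 0 else np.inf


def surface_point(origin, direction, distance):
    """Recover the explicit surface point hit by a ray: p = o + t * d."""
    direction = direction / np.linalg.norm(direction)
    return origin + distance * direction


def consistency_residual(origin_a, direction_a, origin_b, distance_fn):
    """Multi-view consistency check between two rays hitting the same point.

    Ray A yields a surface point p. A second ray from origin_b aimed at p
    should report a distance equal to |p - origin_b|, provided p is visible
    from origin_b (assumed here, not checked).
    """
    p = surface_point(origin_a, direction_a, distance_fn(origin_a, direction_a))
    to_p = p - origin_b
    expected = np.linalg.norm(to_p)
    predicted = distance_fn(origin_b, to_p)
    return abs(predicted - expected)


def make_sphere_field(center, radius):
    """Bind sphere parameters so the field has the signature f(origin, direction)."""
    center = np.asarray(center, dtype=float)
    return lambda o, d: sphere_ray_distance(
        np.asarray(o, dtype=float), np.asarray(d, dtype=float), center, radius
    )
```

For an exact field such as this sphere, the residual is zero up to floating-point error for any pair of mutually visible rays. In a learned setting, a residual of this kind could serve as a penalty term during training, and a single evaluation per ray is what makes depth rendering cheap compared with marching along each ray through a coordinate-based field.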