Generalizable neural surface reconstruction techniques have attracted considerable attention in recent years. However, they suffer from low-confidence depth distributions and inaccurate surface reasoning because of the oversimplified volume rendering process they employ. In this paper, we present Reconstruction TRansformer (ReTR), a novel framework that leverages the transformer architecture to redesign the rendering process, enabling the modeling of complex photon-particle interactions. ReTR introduces a learnable meta-ray token and uses cross-attention to simulate the interaction of photons with sampled points and render the observed color. Moreover, by operating in a high-dimensional feature space rather than in color space, ReTR mitigates sensitivity to projected colors in source views. These improvements yield accurate surface assessment with high confidence. We demonstrate the effectiveness of our approach on various datasets, showing that it outperforms current state-of-the-art approaches in reconstruction quality and generalization ability.
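To make the contrast between conventional volume rendering and attention-based rendering concrete, the following minimal sketch compares the two weighting schemes along a single ray. It is an illustration under simplifying assumptions, not the ReTR implementation: the function names (`volume_render_weights`, `cross_attention_render`, `expected_depth`) are hypothetical, the attention uses a single head with identity projections in place of learned query, key, and value matrices, and the point features are toy values rather than features aggregated from source views.

```python
import math


def volume_render_weights(alphas):
    """Classic volume rendering: w_i = alpha_i * prod_{j<i} (1 - alpha_j)."""
    weights = []
    transmittance = 1.0
    for alpha in alphas:
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cross_attention_render(meta_ray_token, point_features):
    """Single-head cross-attention along one ray.

    Assumption: identity projections stand in for learned W_q, W_k, W_v.
    The query is the (hypothetical) meta-ray token; keys and values are
    the per-point feature vectors sampled along the ray.
    Returns the rendered feature and the attention weights.
    """
    dim = len(meta_ray_token)
    scores = [dot(meta_ray_token, f) / math.sqrt(dim) for f in point_features]
    weights = softmax(scores)
    rendered = [
        sum(w * f[k] for w, f in zip(weights, point_features))
        for k in range(dim)
    ]
    return rendered, weights


def expected_depth(weights, depths):
    """Depth estimate as the weighted mean of sample depths."""
    return sum(w * z for w, z in zip(weights, depths))


if __name__ == "__main__":
    depths = [1.0, 2.0, 3.0, 4.0]

    # Hand-crafted opacities: volume rendering weights follow a fixed formula.
    alphas = [0.1, 0.6, 0.3, 0.1]
    vr_weights = volume_render_weights(alphas)
    print("volume rendering weights:", vr_weights)
    print("volume rendering depth:  ", expected_depth(vr_weights, depths))

    # Toy features: the second sample aligns most strongly with the token.
    token = [1.0, 0.0]
    features = [[0.5, 0.2], [6.0, 0.1], [1.0, 0.3], [0.2, 0.9]]
    rendered, attn = cross_attention_render(token, features)
    print("attention weights:       ", attn)
    print("attention depth:         ", expected_depth(attn, depths))
    print("rendered feature:        ", rendered)
```

In this sketch the volume rendering weights are fully determined by the per-sample opacities, whereas the attention weights depend on a query vector that would be learned in a trained model. The rendered output is a feature vector rather than a color, mirroring the idea of operating in feature space; a separate decoder (not shown) would map it to the observed color.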