Recent advances in neural rendering have enabled highly photorealistic 3D scene reconstruction and novel view synthesis. Despite this progress, current state-of-the-art methods struggle to reconstruct high-frequency detail, due to factors such as the low-frequency bias of radiance fields and inaccurate camera calibration. One approach to mitigating this issue is to enhance images post-rendering. 2D enhancers can be pre-trained to recover some detail, but they are agnostic to scene geometry and do not easily generalize to new distributions of image degradation. Conversely, existing 3D enhancers are able to transfer detail from nearby training images in a generalizable manner, but they suffer from inaccurate camera calibration and can propagate errors from the geometry into rendered images. We propose a neural rendering enhancer, RoGUENeRF, which exploits the best of both paradigms. Our method is pre-trained to learn a general enhancer while also leveraging information from nearby training images via robust 3D alignment and geometry-aware fusion. Our approach restores high-frequency textures while maintaining geometric consistency and is also robust to inaccurate camera calibration. We show that RoGUENeRF substantially enhances the rendering quality of a wide range of neural rendering baselines, e.g. improving the PSNR of MipNeRF360 by 0.63dB and Nerfacto by 1.34dB on the real-world 360v2 dataset.