Dynamic Neural Radiance Fields (NeRF) reconstructed from monocular videos have recently been explored for space-time novel view synthesis, with excellent results. However, defocus blur caused by depth variation often occurs during video capture, compromising the quality of dynamic reconstruction because the lack of sharp detail interferes with modeling temporal consistency across input views. To tackle this issue, we propose D2RF, the first dynamic NeRF method designed to restore sharp novel views from defocused monocular videos. We introduce layered Depth-of-Field (DoF) volume rendering to model the defocus blur and reconstruct a sharp NeRF supervised by defocused views. The blur model is inspired by the connection between DoF rendering and volume rendering: the opacity in volume rendering aligns with the layer visibility in DoF rendering. To perform the blurring, we recast the layered blur kernel as a ray-based kernel and employ an optimized sparse kernel to gather the input rays efficiently, rendering the gathered rays with our layered DoF volume rendering. We synthesize a dataset of defocused dynamic scenes for our task, and extensive experiments on it show that our method outperforms existing approaches in synthesizing all-in-focus novel views from defocused inputs while maintaining spatio-temporal consistency in the scene.
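To make the stated connection concrete, here is a minimal sketch in standard notation (not necessarily the paper's exact formulation). Volume rendering composites samples front-to-back under accumulated transmittance, while layered DoF rendering composites blurred depth layers front-to-back under layer visibility, so the two weights play the same structural role:

$$\hat{C}(\mathbf{r}) = \sum_{i=1}^{N} T_i\,\alpha_i\,\mathbf{c}_i,\qquad T_i = \prod_{j=1}^{i-1}(1-\alpha_j),\qquad \alpha_i = 1-\exp(-\sigma_i\,\delta_i),$$

$$C_{\text{DoF}} = \sum_{k=1}^{K} v_k\,\alpha'_k\,\mathbf{c}'_k,\qquad v_k = \prod_{j=1}^{k-1}\bigl(1-\alpha'_j\bigr),$$

where $\alpha'_k$ and $\mathbf{c}'_k$ denote the alpha and color of depth layer $k$ after applying its blur kernel; the transmittance $T_i$ corresponds to the layer visibility $v_k$.

Likewise, the following is a hedged sketch of how a ray-based sparse blur kernel could operate, assuming a hypothetical `render_ray` callable that maps a sub-pixel location to a color via layered DoF volume rendering; all names, the softmax normalization, and the offset parameterization are illustrative assumptions, not the paper's API:

```python
import torch

def blurred_pixel(render_ray, pixel_xy, offsets, raw_weights):
    """Blur one pixel with a sparse ray-based kernel (illustrative only).

    render_ray : callable mapping a 2D pixel location (tensor, shape (2,))
                 to an RGB color (tensor, shape (3,)) via layered DoF
                 volume rendering -- a stand-in, not the paper's API.
    pixel_xy   : (2,) target pixel location.
    offsets    : (K, 2) learned sparse sample offsets around the pixel.
    raw_weights: (K,) learned blend logits, normalized here by softmax.
    """
    weights = torch.softmax(raw_weights, dim=0)                        # (K,)
    colors = torch.stack([render_ray(pixel_xy + o) for o in offsets])  # (K, 3)
    return (weights[:, None] * colors).sum(dim=0)                      # blurred RGB
```

The blurred color would then be compared against the defocused input pixel, so the underlying NeRF remains sharp while the rendering pipeline reproduces the observed blur.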