Recent advances in 4D scene reconstruction with neural radiance fields (NeRF) have demonstrated the ability to represent dynamic scenes from multi-view videos. However, in unsynchronized settings these methods fail to reconstruct dynamic scenes and struggle to fit even the training views. This happens because they assign a single latent embedding to each frame, even though the multi-view images at that frame were actually captured at different moments. To address this limitation, we introduce a time offset for each unsynchronized video and jointly optimize the offsets with NeRF. By design, our method is applicable to various baselines and improves them by large margins. Furthermore, finding the offsets naturally synchronizes the videos without manual effort. We conduct experiments on the common Plenoptic Video Dataset and a newly built Unsynchronized Dynamic Blender Dataset to verify the performance of our method. Project page: https://seoha-kim.github.io/sync-nerf
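The idea of jointly optimizing per-video time offsets with the scene model can be illustrated with a toy sketch. This is not the paper's implementation: a two-parameter 1D signal stands in for the dynamic NeRF, plain gradient descent stands in for the training loop, and the names `scene`, `scene_grads`, and `fit` are hypothetical. Pinning camera 0 as the time reference is also an assumption made here, to remove the ambiguity of a global time shift.

```python
import numpy as np

TWO_PI = 2.0 * np.pi

def scene(t, params):
    """Toy stand-in for a dynamic NeRF: a*sin(2*pi*t) + b*cos(2*pi*t)."""
    a, b = params
    return a * np.sin(TWO_PI * t) + b * np.cos(TWO_PI * t)

def scene_grads(t, params):
    """Partial derivatives of scene() with respect to a, b, and t."""
    a, b = params
    d_a = np.sin(TWO_PI * t)
    d_b = np.cos(TWO_PI * t)
    d_t = TWO_PI * (a * np.cos(TWO_PI * t) - b * np.sin(TWO_PI * t))
    return d_a, d_b, d_t

def fit(frame_times, observations, steps=2000, lr_scene=0.5, lr_offset=0.05):
    """Jointly fit the scene parameters and one time offset per camera.

    frame_times:  shape (N,), nominal timestamp of each frame index
    observations: shape (K, N), what each of K cameras recorded per frame
    """
    num_cams = observations.shape[0]
    params = np.array([0.5, 0.0])   # deliberately wrong initial scene
    offsets = np.zeros(num_cams)    # start by assuming perfect sync
    for _ in range(steps):
        # Each camera queries the scene at its own shifted time.
        t = frame_times[None, :] + offsets[:, None]
        residual = scene(t, params) - observations
        d_a, d_b, d_t = scene_grads(t, params)
        scale = 2.0 / residual.size  # gradient of the mean squared error
        grad_params = scale * np.array(
            [(residual * d_a).sum(), (residual * d_b).sum()]
        )
        grad_offsets = scale * (residual * d_t).sum(axis=1)
        grad_offsets[0] = 0.0        # assumption: camera 0 is the reference
        params -= lr_scene * grad_params
        offsets -= lr_offset * grad_offsets
    return params, offsets

# Synthetic unsynchronized capture: three cameras, hidden offsets.
frame_times = np.arange(50) / 50.0
true_offsets = np.array([0.0, 0.05, -0.08])
observations = scene(frame_times[None, :] + true_offsets[:, None], (1.0, 0.0))

params, offsets = fit(frame_times, observations)
print("recovered scene parameters:", params)
print("recovered offsets:", offsets)
```

In this toy setting the recovered offsets double as a synchronization of the cameras, which mirrors the abstract's remark that finding the offsets synchronizes the videos. A real dynamic NeRF would replace `scene` with a learned time-conditioned radiance field and rely on automatic differentiation rather than hand-written gradients.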