Recent advancements in 4D scene reconstruction using neural radiance fields (NeRF) have demonstrated the ability to represent dynamic scenes from multi-view videos. However, these methods fail to reconstruct dynamic scenes and struggle to fit even the training views in unsynchronized settings. This failure occurs because they employ a single latent embedding per frame, while the multi-view images nominally assigned to the same frame were in fact captured at different moments. To address this limitation, we introduce a time offset for each unsynchronized video and jointly optimize the offsets with the NeRF. By design, our method is applicable to various baselines and improves them by large margins. Furthermore, learning the offsets naturally synchronizes the videos without any manual effort. We conduct experiments on the common Plenoptic Video Dataset and a newly built Unsynchronized Dynamic Blender Dataset to verify the performance of our method. Project page: https://seoha-kim.github.io/sync-nerf
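To make the core idea concrete, below is a minimal sketch (not the authors' released code) of jointly optimizing per-video time offsets with a dynamic NeRF. It assumes a dynamic NeRF `model(rays, t)` that conditions on a continuous time value; the names `TimeOffsets`, `train_step`, and `model` are hypothetical placeholders.

```python
# Minimal sketch: learn one time offset per unsynchronized video and
# optimize it jointly with the NeRF parameters. All names are illustrative.
import torch
import torch.nn as nn

class TimeOffsets(nn.Module):
    """One learnable time offset per camera/video."""
    def __init__(self, num_cameras: int):
        super().__init__()
        # Initialized to zero, i.e. the synchronized assumption.
        self.delta = nn.Parameter(torch.zeros(num_cameras))

    def forward(self, cam_ids: torch.Tensor, frame_times: torch.Tensor):
        # Corrected time = nominal frame time + per-camera offset.
        return frame_times + self.delta[cam_ids]

def train_step(model, offsets, optimizer, rays, rgb_gt, cam_ids, frame_times):
    # `model(rays, t)` stands in for any dynamic NeRF that accepts a
    # continuous time input instead of a per-frame latent embedding.
    t = offsets(cam_ids, frame_times)      # apply the learned offsets
    rgb_pred = model(rays, t)              # render with corrected times
    loss = ((rgb_pred - rgb_gt) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()                        # gradients also flow into the offsets
    optimizer.step()
    return loss.item()
```

In this sketch, joint optimization simply means registering both parameter sets with one optimizer, e.g. `torch.optim.Adam(list(model.parameters()) + list(offsets.parameters()))`, so the offsets converge toward values that synchronize the videos as reconstruction improves.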