Video frame interpolation (VFI) enables many important applications that involve the temporal domain, such as slow-motion playback, or the spatial domain, such as stop-motion sequences. We focus on the former task, where one of the key challenges is handling high dynamic range (HDR) scenes in the presence of complex motion. To this end, we explore the possible advantages of dual-exposure sensors, which readily provide sharp short exposures and blurry long exposures that are spatially registered and whose ends are temporally aligned. In this setup, motion blur records temporally continuous information about the scene motion that, combined with the sharp reference, enables more precise motion sampling within a single camera shot. We demonstrate that this facilitates the reconstruction of more complex motion in the VFI task, as well as HDR frame reconstruction, which so far has been considered only for the originally captured frames and not for in-between interpolated frames. We design a neural network trained on these tasks that clearly outperforms existing solutions. We also propose a metric for scene motion complexity that provides important insights into the performance of VFI methods at test time.