Large motion poses a critical challenge in the Video Frame Interpolation (VFI) task. Existing methods are often constrained by limited receptive fields, resulting in sub-optimal performance on scenes with large motion. In this paper, we introduce a new pipeline for VFI that effectively integrates global-level information to alleviate the issues associated with large motion. Specifically, we first estimate a pair of initial intermediate flows using a high-resolution feature map to extract local details. Then, we incorporate a sparse global matching branch to compensate for errors in flow estimation; this branch identifies flaws in the initial flows and generates sparse flow compensation with a global receptive field. Finally, we adaptively merge the initial flow estimation with the global flow compensation, yielding a more accurate intermediate flow. To evaluate the effectiveness of our method in handling large motion, we carefully curate more challenging subsets from commonly used benchmarks. Our method achieves state-of-the-art performance on these large-motion VFI subsets.
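The three steps of the pipeline (initial flow, sparse global compensation, merging) can be illustrated with a minimal NumPy sketch. This is not the implementation from the paper: it simplifies the pair of intermediate flows to a single flow between two feature maps, takes the flaw map as a given input, and replaces the adaptive merge with a fixed scalar `weight`. The function name `sparse_global_compensation` and all of its parameters are hypothetical, chosen only for illustration.

```python
import numpy as np


def sparse_global_compensation(feat_src, feat_dst, init_flow, flaw_score,
                               k=4, weight=1.0):
    """Toy sketch: correct an initial flow at its k most suspicious pixels.

    feat_src, feat_dst : (H, W, C) feature maps (assumed unit-normalised).
    init_flow          : (H, W, 2) initial flow stored as (dx, dy).
    flaw_score         : (H, W) map, higher = initial flow less reliable
                         (assumed given; how to compute it is left open).
    weight             : fixed blend factor standing in for an adaptive merge.
    """
    h, w, c = feat_src.shape

    # 1. Sparse selection: keep only the k pixels with the highest flaw score.
    flat_idx = np.argsort(flaw_score.ravel())[-k:]
    ys, xs = np.unravel_index(flat_idx, (h, w))

    # 2. Global matching: compare each selected feature with every position
    #    in the destination map, so the receptive field is the whole image.
    src = feat_src[ys, xs]                 # (k, C)
    dst = feat_dst.reshape(-1, c)          # (H*W, C)
    similarity = src @ dst.T               # (k, H*W)
    best = similarity.argmax(axis=1)
    match_y, match_x = np.unravel_index(best, (h, w))
    global_flow = np.stack([match_x - xs, match_y - ys], axis=1)
    global_flow = global_flow.astype(init_flow.dtype)

    # 3. Merge: blend the sparse global estimate into the initial flow.
    merged = init_flow.copy()
    merged[ys, xs] = (1.0 - weight) * init_flow[ys, xs] + weight * global_flow
    return merged, (ys, xs)
```

Because the matching in step 2 is run only at the `k` selected pixels, its cost grows with `k * H * W` rather than `(H * W) ** 2`, which is the point of keeping the global branch sparse.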