Large motion poses a critical challenge in the Video Frame Interpolation (VFI) task. Existing methods are often constrained by limited receptive fields, resulting in sub-optimal performance in scenarios with large motion. In this paper, we introduce a new pipeline for VFI that effectively integrates global-level information to alleviate issues associated with large motion. Specifically, we first estimate a pair of initial intermediate flows using a high-resolution feature map for extracting local details. Then, we incorporate a sparse global matching branch to compensate for flow estimation, which consists of identifying flaws in the initial flows and generating sparse flow compensation with a global receptive field. Finally, we adaptively merge the initial flow estimation with the global flow compensation, yielding a more accurate intermediate flow. To evaluate the effectiveness of our method in handling large motion, we carefully curate more challenging subsets from commonly used benchmarks. Our method achieves state-of-the-art performance on these VFI subsets with large motion.
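The three stages described in the abstract (initial flow, sparse global compensation, adaptive merge) can be illustrated with a toy sketch. This is not the authors' implementation: the function names `select_flawed`, `global_match`, and `merge_flows`, the per-pixel flaw scores, the dot-product similarity, and the linear blending weights are all assumptions made for illustration, and the example works on a flattened single-row image rather than on intermediate flows between real frames.

```python
from typing import Dict, List, Tuple

Vec = Tuple[float, float]


def dot(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Dot-product similarity between two feature vectors (assumed metric)."""
    return sum(x * y for x, y in zip(a, b))


def select_flawed(flaw_scores: List[float], k: int) -> List[int]:
    """Return the indices of the k pixels with the highest flaw score.

    The flaw score is hypothetical: any per-pixel estimate of how
    unreliable the initial flow is would play this role.
    """
    order = sorted(range(len(flaw_scores)),
                   key=lambda i: flaw_scores[i], reverse=True)
    return sorted(order[:k])


def global_match(idx: int, feats_src, feats_dst, width: int) -> Vec:
    """Compare source pixel idx with every target pixel (global receptive
    field) and return the displacement (dx, dy) to the best match."""
    best = max(range(len(feats_dst)),
               key=lambda j: dot(feats_src[idx], feats_dst[j]))
    return (float(best % width - idx % width),
            float(best // width - idx // width))


def merge_flows(initial: List[Vec], compensation: Dict[int, Vec],
                weights: List[float]) -> List[Vec]:
    """Blend the initial flow with the sparse compensation.

    Pixels without compensation keep their initial flow; the linear
    blend with a per-pixel weight is an assumed merging rule.
    """
    merged = list(initial)
    for i, (cx, cy) in compensation.items():
        w = weights[i]
        fx, fy = initial[i]
        merged[i] = ((1 - w) * fx + w * cx, (1 - w) * fy + w * cy)
    return merged


# Toy example: a 1x4 image where pixel 0 moves three pixels to the right,
# a displacement the (all-zero) initial flow fails to capture.
width = 4
feats_src = [(1.0, 0.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
feats_dst = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (1.0, 0.0)]
initial = [(0.0, 0.0)] * 4
flaw_scores = [0.9, 0.1, 0.1, 0.1]   # hypothetical flaw estimates
weights = [0.5, 0.0, 0.0, 0.0]       # hypothetical merge weights

sparse = select_flawed(flaw_scores, k=1)
compensation = {i: global_match(i, feats_src, feats_dst, width) for i in sparse}
merged = merge_flows(initial, compensation, weights)
print(merged)
```

Only the pixels flagged as flawed pay the cost of matching against the whole target frame, which is the point of keeping the global branch sparse; every other pixel keeps its locally estimated flow.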