Video frame interpolation (VFI) methods synthesize new frames between existing ones in order to increase a video's frame rate. However, current methods are prone to blurring and spurious artifacts in challenging scenarios involving occlusions and discontinuous motion. Moreover, they typically rely on optical flow estimation, which adds modeling complexity and computational cost. To address these issues, we introduce a Motion-Aware Video Frame Interpolation (MA-VFI) network, which directly estimates intermediate optical flow from consecutive frames by means of a novel hierarchical pyramid module. This module extracts global semantic relationships and spatial details from the input frames at different receptive fields, enabling the model to capture intricate motion patterns, while also reducing the required computational cost and complexity. A cross-scale motion structure then estimates and refines intermediate flow maps from the extracted features. This design lets input frame features and flow maps interact during interpolation and markedly improves the accuracy of the intermediate flow maps. Finally, a carefully designed loss centered on the intermediate flow guides the flow prediction, substantially improving the accuracy of the intermediate flow maps. Experiments show that MA-VFI surpasses several representative VFI methods across various datasets and improves efficiency while maintaining good performance.
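The abstract does not spell out how an estimated intermediate flow is turned into a frame. As a rough illustration only, the sketch below shows the backward-warping step commonly used in flow-based interpolation: each input frame is sampled along the flow from the target time step to that frame, and the two warped results are averaged. It is not the MA-VFI network. The function names `backward_warp` and `interpolate_midframe`, the single-channel frames, the border clamping, and the equal-weight blend are all assumptions made for this example, and the flows are supplied by hand rather than predicted.

```python
import numpy as np


def backward_warp(frame, flow):
    """Bilinearly sample a (H, W) frame at positions displaced by a (H, W, 2) flow.

    flow[..., 0] is the horizontal (x) displacement and flow[..., 1] the
    vertical (y) displacement. Sample positions are clamped to the border
    (an assumption of this sketch).
    """
    h, w = frame.shape
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    x = np.clip(xs + flow[..., 0], 0, w - 1)
    y = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners surrounding each sample position.
    x0 = np.floor(x).astype(int)
    y0 = np.floor(y).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Fractional offsets serve as bilinear weights.
    wx = x - x0
    wy = y - y0
    top = (1 - wx) * frame[y0, x0] + wx * frame[y0, x1]
    bottom = (1 - wx) * frame[y1, x0] + wx * frame[y1, x1]
    return (1 - wy) * top + wy * bottom


def interpolate_midframe(frame0, frame1, flow_t0, flow_t1):
    """Blend two backward-warped frames with equal weights (hypothetical helper).

    flow_t0 / flow_t1 are the intermediate flows from the target time step
    to frame0 / frame1 respectively.
    """
    warped0 = backward_warp(frame0, flow_t0)
    warped1 = backward_warp(frame1, flow_t1)
    return 0.5 * (warped0 + warped1)


# Toy example: a single bright pixel moves 2 px to the right between frames.
frame0 = np.zeros((5, 8)); frame0[2, 2] = 1.0
frame1 = np.zeros((5, 8)); frame1[2, 4] = 1.0
flow_t0 = np.zeros((5, 8, 2)); flow_t0[..., 0] = -1.0  # midpoint -> frame0
flow_t1 = np.zeros((5, 8, 2)); flow_t1[..., 0] = 1.0   # midpoint -> frame1
mid = interpolate_midframe(frame0, frame1, flow_t0, flow_t1)
print(mid[2, 3])  # 1.0: the pixel lands halfway between its two positions
```

In a learned method, the hand-written flows above would be replaced by the network's predictions, and the fixed 0.5 blend is often replaced by a learned, occlusion-aware mask. Both of those are outside what this sketch covers.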