Video frame interpolation (VFI) has made great progress in recent years. However, existing VFI models still struggle to achieve a good trade-off between accuracy and efficiency: fast models often have inferior accuracy, while accurate models typically run slowly. Yet easy samples with small motion or clear texture can be interpolated well by simple models and do not require heavy computation. In this paper, we present an integrated pipeline that combines difficulty assessment with video frame interpolation. Specifically, it first uses a pre-assessment model to measure the interpolation difficulty of the input frames and then dynamically selects an appropriate VFI model to generate the interpolated result. Furthermore, we collect and annotate a large-scale VFI difficulty assessment dataset to train the pre-assessment model. Extensive experiments show that easy samples are routed to fast models while difficult samples are processed by heavy models, and that the proposed pipeline improves the accuracy-efficiency trade-off for VFI.
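The routing idea described in the abstract (score the difficulty of an input frame pair, then dispatch it to a model of matching cost) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not taken from the paper: `assess_difficulty` uses the mean absolute pixel difference as a crude stand-in for the learned pre-assessment model, `fast_model` and `heavy_model` are placeholders that simply average the two frames, and the `0.1` threshold in `TIERS` is arbitrary.

```python
from typing import Callable, List, Sequence, Tuple

# A grayscale frame as rows of pixel intensities in [0, 1].
Frame = List[List[float]]
Interpolator = Callable[[Frame, Frame], Frame]
# (upper difficulty bound, tier name, model); tiers are ordered cheap to expensive.
Tier = Tuple[float, str, Interpolator]


def assess_difficulty(frame0: Frame, frame1: Frame) -> float:
    """Hypothetical stand-in for the learned pre-assessment model.

    Returns the mean absolute pixel difference between the two frames,
    used here as a rough proxy for motion magnitude.
    """
    total = 0.0
    count = 0
    for row0, row1 in zip(frame0, frame1):
        for p0, p1 in zip(row0, row1):
            total += abs(p0 - p1)
            count += 1
    return total / count if count else 0.0


def blend(frame0: Frame, frame1: Frame) -> Frame:
    """Per-pixel average of two frames (placeholder for a real VFI network)."""
    return [
        [(p0 + p1) / 2.0 for p0, p1 in zip(row0, row1)]
        for row0, row1 in zip(frame0, frame1)
    ]


def fast_model(frame0: Frame, frame1: Frame) -> Frame:
    """Placeholder for a lightweight VFI model (assumption)."""
    return blend(frame0, frame1)


def heavy_model(frame0: Frame, frame1: Frame) -> Frame:
    """Placeholder for an accurate but slow VFI model (assumption)."""
    return blend(frame0, frame1)


def select_model(score: float, tiers: Sequence[Tier]) -> Tuple[str, Interpolator]:
    """Return the first tier whose upper bound covers the score.

    Falls back to the last (most expensive) tier if none matches.
    """
    for upper, name, model in tiers:
        if score <= upper:
            return name, model
    _, name, model = tiers[-1]
    return name, model


def interpolate(frame0: Frame, frame1: Frame, tiers: Sequence[Tier]) -> Tuple[str, Frame]:
    """Assess difficulty, pick a model, and return (tier name, middle frame)."""
    score = assess_difficulty(frame0, frame1)
    name, model = select_model(score, tiers)
    return name, model(frame0, frame1)


# Illustrative two-tier configuration; the 0.1 threshold is arbitrary.
TIERS: List[Tier] = [
    (0.1, "fast", fast_model),
    (float("inf"), "heavy", heavy_model),
]
```

With this configuration, a nearly static frame pair would be dispatched to the `fast` tier and a pair with large pixel changes to the `heavy` tier. In a real system the difficulty score would come from a trained model and the thresholds would be tuned against the desired accuracy and runtime budget.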