Digital video inpainting techniques have improved substantially with deep learning in recent years. Although inpainting was originally designed to repair damaged areas, it can also be used maliciously to remove important objects and thereby fabricate scenes and facts. It is therefore important to identify inpainted regions blindly. In this paper, we present a Trusted Video Inpainting Localization network (TruVIL) with excellent robustness and generalization ability. Observing that high-frequency noise can effectively reveal inpainted regions, we design deep attentive noise learning in multiple stages to capture inpainting traces. First, a multi-scale noise extraction module based on 3D High Pass (HP3D) layers creates the noise modality from the input RGB frames. Then, the correlation between the two complementary modalities, RGB and noise, is explored by a cross-modality attentive fusion module to facilitate mutual feature learning. Lastly, spatial details are selectively enhanced by an attentive noise decoding module to boost the localization performance of the network. To prepare enough training samples, we also build a frame-level video object segmentation dataset of 2500 videos with pixel-level annotation for all frames. Extensive experimental results validate the superiority of TruVIL over state-of-the-art methods. In particular, both quantitative and qualitative evaluations on various inpainted videos verify the robustness and generalization ability of the proposed TruVIL. Code and dataset will be available at https://github.com/multimediaFor/TruVIL.
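The abstract does not specify how the HP3D layers are built, so the following is only a rough, dependency-free sketch of the general idea behind a 3D high-pass residual: each sample of a grayscale video has the mean of its six spatio-temporal neighbours subtracted, which suppresses smooth content and keeps high-frequency noise. The function name `high_pass_3d`, the six-neighbour kernel, and the edge clamping are illustrative assumptions, not the authors' implementation.

```python
def high_pass_3d(video):
    """Return a 3D high-pass residual of a grayscale video.

    `video` is a nested list indexed as video[t][y][x]. Each output value is
    the input sample minus the mean of its six axis-aligned neighbours in
    (t, y, x). Out-of-range neighbours are clamped to the nearest valid index.
    This fixed kernel is an assumption chosen for illustration only.
    """
    n_frames, height, width = len(video), len(video[0]), len(video[0][0])

    def at(t, y, x):
        # Clamp indices so border samples reuse the nearest valid neighbour.
        t = min(max(t, 0), n_frames - 1)
        y = min(max(y, 0), height - 1)
        x = min(max(x, 0), width - 1)
        return video[t][y][x]

    # One step backward and forward along each of the three axes.
    offsets = [(-1, 0, 0), (1, 0, 0), (0, -1, 0), (0, 1, 0), (0, 0, -1), (0, 0, 1)]

    return [
        [
            [
                video[t][y][x]
                - sum(at(t + dt, y + dy, x + dx) for dt, dy, dx in offsets) / 6.0
                for x in range(width)
            ]
            for y in range(height)
        ]
        for t in range(n_frames)
    ]


if __name__ == "__main__":
    # A perfectly flat clip carries no high-frequency content.
    flat = [[[5.0] * 4 for _ in range(4)] for _ in range(3)]
    print(high_pass_3d(flat)[1][1][1])  # 0.0

    # A single bright sample stands out strongly in the residual.
    spike = [[[0.0] * 3 for _ in range(3)] for _ in range(3)]
    spike[1][1][1] = 6.0
    print(high_pass_3d(spike)[1][1][1])  # 6.0
```

In a learned setting, such a filter would more plausibly be expressed as a 3D convolution whose weights are initialised to a high-pass kernel; whether TruVIL does this, and at which scales, is not stated in the abstract.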