Recent image manipulation localization and detection techniques usually rely on forensic artifacts and traces produced by noise-sensitive filters, such as SRM and Bayar convolution. In this paper, we show that the filters commonly used in such approaches excel at unveiling different types of manipulations and provide complementary forensic traces. We therefore explore ways of merging the outputs of these filters, aiming to exploit the complementary nature of the artifacts they produce to perform image manipulation localization and detection (IMLD). We propose two distinct methods: late fusion, which produces independent features from each forensic filter and then fuses them, and early fusion, which mixes the outputs of the different filters at an early stage and produces combined features early on. We demonstrate that both approaches achieve competitive performance on both image manipulation localization and detection, outperforming state-of-the-art models across several datasets.
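To make the distinction between the two fusion strategies concrete, the following is a minimal toy sketch in plain Python. It is not the architecture proposed in the paper: the two high-pass kernels are hypothetical stand-ins for SRM and Bayar convolution, `encode` is a hypothetical placeholder for a learned feature extractor, and the fusion weights and the max-based merge are assumptions chosen only for illustration.

```python
# Toy sketch of early vs. late fusion of forensic filter outputs.
# All kernels, weights, and the "encoder" are illustrative assumptions,
# not the SRM/Bayar filters or the networks used in the paper.

def conv3x3(img, kernel):
    """Valid 3x3 correlation of a 2D list image with a 3x3 kernel."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h - 2):
        row = []
        for x in range(w - 2):
            row.append(sum(kernel[j][i] * img[y + j][x + i]
                           for j in range(3) for i in range(3)))
        out.append(row)
    return out

# Two high-pass kernels whose weights sum to zero, so flat regions map to 0.
LAPLACIAN = [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
HORIZONTAL = [[0, 0, 0], [1, -2, 1], [0, 0, 0]]

def encode(channels, weights):
    """Toy 'encoder': per-pixel weighted sum of absolute filter responses."""
    h, w = len(channels[0]), len(channels[0][0])
    return [[sum(wt * abs(ch[y][x]) for wt, ch in zip(weights, channels))
             for x in range(w)] for y in range(h)]

def early_fusion(img):
    # Mix the filter outputs first, then run a single encoder on the stack.
    residuals = [conv3x3(img, LAPLACIAN), conv3x3(img, HORIZONTAL)]
    return encode(residuals, weights=[0.5, 0.5])

def late_fusion(img):
    # Encode each filter output independently, then merge the features.
    f1 = encode([conv3x3(img, LAPLACIAN)], weights=[1.0])
    f2 = encode([conv3x3(img, HORIZONTAL)], weights=[1.0])
    return [[max(a, b) for a, b in zip(r1, r2)] for r1, r2 in zip(f1, f2)]

if __name__ == "__main__":
    # A flat 5x5 image with one altered pixel in the centre.
    image = [[10] * 5 for _ in range(5)]
    image[2][2] = 50
    print(early_fusion(image))
    print(late_fusion(image))
```

In this sketch both variants return a per-pixel response map that is zero on a flat image and non-zero around the altered pixel. The only difference is where the two filter outputs are combined: before the encoder (early fusion) or after it (late fusion).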