Powerful manipulation techniques have made digital image forgeries easy to create and spread without leaving visual anomalies, so the blind localization of tampered regions has become important for image forensics. In this paper, we propose an effective image tampering localization network (EITLNet) based on a two-branch enhanced transformer encoder with attention-based feature fusion. Specifically, a feature enhancement module is designed to strengthen the feature representation ability of the transformer encoder. The features extracted from the RGB and noise streams are fused at multiple scales by a coordinate attention-based fusion module. Extensive experimental results verify that the proposed scheme achieves state-of-the-art generalization ability and robustness on various benchmark datasets. Code will be made public at https://github.com/multimediaFor/EITLNet.
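
To make the fusion step concrete, the following is a minimal sketch of coordinate attention applied to two feature maps at a single scale. It is not the authors' implementation: the function name `coordinate_attention_fusion`, the weight arguments, the element-wise sum of the two streams before attention, and the use of ReLU without normalization are all assumptions made for illustration, since the abstract does not specify these details. Feature maps are plain nested lists of shape C x H x W so that the example has no dependencies.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(weight, vec):
    """Apply a 1x1 convolution at one position: weight is [out][in], vec is [in]."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weight]


def coordinate_attention_fusion(rgb, noise, w_reduce, w_h, w_w):
    """Fuse two C x H x W feature maps with coordinate attention (hypothetical sketch).

    Assumptions (not from the paper): the streams are summed element-wise
    before attention, and ReLU replaces the normalization + activation
    commonly used after the channel-reducing convolution.
    w_reduce: [mid][C], w_h and w_w: [C][mid].
    """
    C, H, W = len(rgb), len(rgb[0]), len(rgb[0][0])

    # Combine the RGB and noise streams (assumed: element-wise sum).
    fused = [[[rgb[c][i][j] + noise[c][i][j] for j in range(W)]
              for i in range(H)] for c in range(C)]

    # Directional pooling: average over width gives one C-vector per row,
    # average over height gives one C-vector per column.
    pool_h = [[sum(fused[c][i]) / W for c in range(C)] for i in range(H)]
    pool_w = [[sum(fused[c][i][j] for i in range(H)) / H for c in range(C)]
              for j in range(W)]

    # Shared channel-reducing 1x1 convolution over the concatenated positions.
    hidden = [[max(0.0, v) for v in matvec(w_reduce, vec)]
              for vec in pool_h + pool_w]
    hid_h, hid_w = hidden[:H], hidden[H:]

    # Separate 1x1 convolutions restore C channels; sigmoid yields gates in (0, 1).
    att_h = [[sigmoid(v) for v in matvec(w_h, vec)] for vec in hid_h]  # H x C
    att_w = [[sigmoid(v) for v in matvec(w_w, vec)] for vec in hid_w]  # W x C

    # Each position is reweighted by its row gate and its column gate.
    return [[[fused[c][i][j] * att_h[i][c] * att_w[j][c] for j in range(W)]
             for i in range(H)] for c in range(C)]
```

Because the row and column gates are computed from direction-specific pooled descriptors, the attention retains positional information along each axis, which is a plausible reason to prefer it over purely channel-wise gating for a localization task. In a multi-scale setting, a function like this would be applied once per encoder stage with its own weights.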