With diverse presentation forgery methods emerging continually, detecting the authenticity of images has drawn growing attention. Although existing methods achieve impressive accuracy when evaluated on the datasets they were trained on, they still perform poorly on unseen domains and are affected by forgery-irrelevant information such as background and identity, which limits their generalizability. To address this problem, we propose a novel framework, Selective Domain-Invariant Feature (SDIF), which reduces sensitivity to face forgery by fusing content features with styles. Specifically, we first use a Farthest-Point Sampling (FPS) training strategy to construct a task-relevant style sample representation space for fusion with content features. Then, we propose a dynamic feature extraction module that generates features with diverse styles to improve the performance and effectiveness of the feature extractor. Finally, a domain separation strategy is used to retain domain-related features that help distinguish real faces from fake ones. Both qualitative and quantitative results on existing benchmarks and proposals demonstrate the effectiveness of our approach.
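To make the sampling step concrete, the following is a minimal sketch of generic farthest-point sampling, not the SDIF training strategy itself, whose details the abstract does not give. It assumes each style sample is summarized by a small numeric vector; the function name `farthest_point_sampling`, the `start` parameter, and the toy `style_vectors` are hypothetical and chosen only for illustration.

```python
import math


def farthest_point_sampling(points, k, start=0):
    """Greedily pick k indices so that each new point is as far as
    possible from the points already selected (Euclidean distance)."""
    if not 0 < k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    selected = [start]
    # min_dist[i] is the distance from point i to its nearest selected point.
    min_dist = [math.dist(p, points[start]) for p in points]
    while len(selected) < k:
        # The next sample is the point farthest from the current selection.
        nxt = max(range(len(points)), key=min_dist.__getitem__)
        selected.append(nxt)
        # Update nearest-selected distances with the newly added point.
        for i, p in enumerate(points):
            d = math.dist(p, points[nxt])
            if d < min_dist[i]:
                min_dist[i] = d
    return selected


# Hypothetical 2-D style descriptors (assumed for illustration, not from the paper).
style_vectors = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1),  # three near-duplicate styles
    (5.0, 5.0), (5.1, 5.0),              # a second cluster
    (10.0, 0.0),                         # an isolated style
]
picked = farthest_point_sampling(style_vectors, k=3)
print(picked)
```

In this toy example the three selected indices fall in three different clusters, which is the property that makes farthest-point sampling a plausible way to build a diverse set of style samples from redundant data.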