Neural-network-driven visual data generation has progressed remarkably, and neural rendering techniques such as Neural Radiance Fields (NeRF) and 3D Gaussian splatting now offer a powerful alternative to GANs and diffusion models. These methods can produce high-fidelity images and lifelike avatars, which heightens the need for robust detection methods. In response, we propose an unsupervised training technique that enables the model to extract comprehensive features from the Fourier spectrum magnitude while overcoming the difficulty that the spectrum's centrosymmetry poses for reconstruction. By leveraging the spectral domain and dynamically combining it with spatial-domain information, we build a robust multimodal detector that demonstrates superior generalization in identifying challenging synthetic images produced by the latest image synthesis techniques. Because no database of fake images based on 3D neural rendering exists, we also develop a comprehensive database of images generated by diverse neural rendering techniques, providing a solid foundation for evaluating and advancing detection methods.
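The centrosymmetry mentioned above is a general property of the Fourier magnitude of any real-valued image: the magnitude at frequency (u, v) equals the magnitude at (-u, -v). The sketch below is not the proposed training technique. It only illustrates the property with a naive pure-Python DFT on a small random image, and counts how many magnitude entries are not determined by a mirrored partner. All function names (`dft2`, `magnitude`, `is_centrosymmetric`, `independent_coefficients`) are hypothetical helpers introduced for this illustration. One plausible reading of the reconstruction difficulty, stated here as an assumption, is that a model asked to reconstruct a hidden magnitude entry could simply copy its mirrored partner instead of learning useful features.

```python
import cmath
import random


def dft2(image):
    """Naive 2D discrete Fourier transform of a real-valued image (list of rows)."""
    rows, cols = len(image), len(image[0])
    out = [[0j] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    angle = -2 * cmath.pi * (u * y / rows + v * x / cols)
                    total += image[y][x] * cmath.exp(1j * angle)
            out[u][v] = total
    return out


def magnitude(spectrum):
    """Element-wise magnitude of a complex spectrum."""
    return [[abs(c) for c in row] for row in spectrum]


def is_centrosymmetric(mag, tol=1e-9):
    """Check |F(u, v)| == |F(-u, -v)|, with frequency indices taken modulo the size."""
    rows, cols = len(mag), len(mag[0])
    return all(
        abs(mag[u][v] - mag[(-u) % rows][(-v) % cols]) <= tol
        for u in range(rows)
        for v in range(cols)
    )


def independent_coefficients(rows, cols):
    """Count magnitude entries left after merging each entry with its mirrored partner."""
    seen = set()
    count = 0
    for u in range(rows):
        for v in range(cols):
            if (u, v) in seen:
                continue
            seen.add((u, v))
            seen.add(((-u) % rows, (-v) % cols))
            count += 1
    return count


# Assumption: a small random 8x8 "image" stands in for real data.
random.seed(0)
image = [[random.random() for _ in range(8)] for _ in range(8)]
mag = magnitude(dft2(image))

print(is_centrosymmetric(mag))
print(independent_coefficients(8, 8), "of", 8 * 8)
```

In practice a library FFT such as `numpy.fft.fft2` would replace the naive loop; the quadruple loop is used here only to keep the sketch dependency-free. For the 8x8 case, roughly half of the 64 magnitude entries are redundant copies of their mirrored partners.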