Graph generative models have become increasingly effective for approximating data distributions and for data augmentation. However, they have also raised public concern about malicious misuse and the spread of misinformation, much as Deepfake visual and auditory media have. It is therefore essential to regulate the prevalence of generated graphs. To tackle this problem, we pioneer the formulation of the generated graph detection problem, which is to distinguish generated graphs from real ones. We propose the first framework to systematically investigate a set of sophisticated models and their performance in four classification scenarios. Each scenario switches between seen and unseen datasets/generators during testing, in order to move closer to real-world settings and to progressively challenge the classifiers. Extensive experiments show that all the models are qualified for generated graph detection, with specific models having advantages in specific scenarios. Given the validated generality of the classifiers and their obliviousness to unseen datasets/generators, we conclude that our solution can remain effective for a considerable time in curbing the misuse of generated graphs.
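One way to picture the four scenarios is as the combinations of "seen or unseen dataset" and "seen or unseen generator" at test time. The sketch below rests on that reading, which is an assumption and not a statement of the actual experimental protocol. The function names, the `(dataset, generator)` representation of a test split, and the placeholder names `dataset_A`, `gen_1`, and so on are all hypothetical.

```python
from itertools import product


def enumerate_scenarios():
    """Return the four (dataset_seen, generator_seen) flag combinations.

    Assumption: the four scenarios form a 2 x 2 grid over these two flags.
    """
    return list(product([True, False], repeat=2))


def select_test_splits(splits, train_datasets, train_generators,
                       dataset_seen, generator_seen):
    """Keep the test splits that match one scenario.

    Each split is a hypothetical (dataset, generator) pair: real graphs from
    `dataset` versus graphs produced by `generator`.
    """
    return [
        (dataset, generator)
        for dataset, generator in splits
        if (dataset in train_datasets) == dataset_seen
        and (generator in train_generators) == generator_seen
    ]


# Placeholder names only; they do not refer to real datasets or generators.
train_datasets = {"dataset_A"}
train_generators = {"gen_1"}
all_splits = list(product(["dataset_A", "dataset_B"], ["gen_1", "gen_2"]))

for dataset_seen, generator_seen in enumerate_scenarios():
    chosen = select_test_splits(all_splits, train_datasets, train_generators,
                                dataset_seen, generator_seen)
    print(dataset_seen, generator_seen, chosen)
```

Under this reading, the hardest case is the one where both flags are `False`, since the classifier is tested on a dataset and a generator that it never encountered during training.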