PGB: Benchmarking Differentially Private Synthetic Graph Generation Algorithms

Differentially private graph analysis is a powerful tool for deriving insights from diverse graph data while protecting individual information. Designing private analytic algorithms for different graph queries often requires starting from scratch. In contrast, differentially private synthetic graph generation offers a general paradigm that supports one-time generation for multiple queries. Although a rich set of differentially private graph generation algorithms has been proposed, comparing them effectively remains challenging due to various factors, including differing privacy definitions, diverse graph datasets, varied privacy requirements, and multiple utility metrics. To this end, we propose PGB (Private Graph Benchmark), a comprehensive benchmark designed to enable researchers to compare differentially private graph generation algorithms fairly. We begin by identifying four essential elements of existing works as a 4-tuple: mechanisms, graph datasets, privacy requirements, and utility metrics. We discuss principles regarding these elements to ensure the comprehensiveness of a benchmark. Next, we present a benchmark instantiation that adheres to all principles, establishing a new method to evaluate existing and newly proposed graph generation algorithms. Through extensive theoretical and empirical analysis, we gain valuable insights into the strengths and weaknesses of prior algorithms. Our results indicate that there is no universal solution for all possible cases. Finally, we provide guidelines to help researchers select appropriate mechanisms for various scenarios.
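As a rough illustration of the 4-tuple view described above, a benchmark run can be thought of as one point in the cross product of mechanisms, graph datasets, privacy requirements, and utility metrics. The sketch below is only a minimal illustration under stated assumptions: the class `BenchmarkCase`, the function `enumerate_cases`, the placeholder mechanism, dataset, and metric names, and the choice to represent a privacy requirement as a single privacy budget `epsilon` are all hypothetical and are not taken from PGB itself.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class BenchmarkCase:
    """One hypothetical benchmark configuration: a point in the 4-tuple space."""
    mechanism: str   # name of a private graph generation algorithm (placeholder)
    dataset: str     # name of a graph dataset (placeholder)
    epsilon: float   # privacy requirement, assumed here to be a privacy budget
    metric: str      # name of a utility metric (placeholder)


def enumerate_cases(mechanisms, datasets, epsilons, metrics):
    """Return every combination of the four elements as a list of cases."""
    return [
        BenchmarkCase(m, d, e, u)
        for m, d, e, u in product(mechanisms, datasets, epsilons, metrics)
    ]


# Placeholder values chosen only for illustration.
cases = enumerate_cases(
    mechanisms=["mechanism_a", "mechanism_b"],
    datasets=["graph_x"],
    epsilons=[0.5, 1.0],
    metrics=["degree_distribution", "triangle_count"],
)
print(len(cases))  # 2 * 1 * 2 * 2 = 8 configurations
```

Enumerating the full grid in this way makes it explicit that every mechanism is evaluated on the same datasets, privacy settings, and metrics, which is the kind of like-for-like comparison the benchmark aims for.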
