Generative models trained with Differential Privacy (DP) are increasingly used to create synthetic data for downstream applications. Existing literature, however, focuses primarily on basic benchmarking datasets and tends to report promising results only for elementary metrics and relatively simple data distributions. In this paper, we initiate a systematic analysis of how DP generative models perform in their natural application scenarios, focusing specifically on real-world gene expression data. We analyze five representative DP generation methods from several angles, including downstream utility, statistical properties, and biological plausibility. Our evaluation illuminates the characteristics of each DP generation method, offers insights into the strengths and weaknesses of each approach, and points to possibilities for future development. Perhaps surprisingly, our analysis reveals that most methods achieve seemingly reasonable downstream utility according to the standard evaluation metrics considered in existing literature. Nevertheless, we find that none of the DP methods accurately captures the biological characteristics of the real dataset. This observation suggests that current assessments of methods in this field may be over-optimistic and underscores a pressing need for improvements in model design.