We focus on the problem of generating high-quality, private synthetic glucose traces, a task that generalizes to many other time series sources. Existing methods for time series synthesis, such as those based on Generative Adversarial Networks (GANs), fail to capture the innate characteristics of glucose data and cannot provide any formal privacy guarantees without severely degrading the utility of the synthetic data. In this paper, we present GlucoSynth, a novel privacy-preserving GAN framework for generating synthetic glucose traces. The core intuition behind our approach is to preserve the relationships among motifs (glucose events) within the traces, in addition to their temporal dynamics. Our framework incorporates differential privacy mechanisms to provide strong formal privacy guarantees. We provide a comprehensive evaluation of the real-world utility of the data using 1.2 million glucose traces; GlucoSynth outperforms all previous methods in its ability to generate high-quality synthetic glucose traces with strong privacy guarantees.