We focus on the problem of generating high-quality, private synthetic glucose traces, a task generalizable to many other time series sources. Existing methods for time series data synthesis, such as those using Generative Adversarial Networks (GANs), fail to capture the innate characteristics of glucose data and cannot provide formal privacy guarantees without severely degrading the utility of the synthetic data. In this paper, we present GlucoSynth, a novel privacy-preserving GAN framework for generating synthetic glucose traces. The core intuition behind our approach is to conserve relationships among motifs (glucose events) within the traces, in addition to temporal dynamics. Our framework incorporates differential privacy mechanisms to provide strong formal privacy guarantees. We provide a comprehensive evaluation of the real-world utility of the data using 1.2 million glucose traces; GlucoSynth outperforms all previous methods in its ability to generate high-quality synthetic glucose traces with strong privacy guarantees.