Smartwatch health sensor data are increasingly used in smart health applications and patient monitoring, including stress detection. However, such medical data often contain sensitive personal information and are resource-intensive to acquire for research purposes. To address this challenge, we introduce the privacy-aware synthesis of multi-sensor smartwatch health readings related to moments of stress, employing Generative Adversarial Networks (GANs) and Differential Privacy (DP) safeguards. Our method not only protects patient information but also enhances data availability for research. To ensure its usefulness, we test synthetic data from multiple GANs and employ different data augmentation strategies on an actual stress detection task. Our GAN-based augmentation methods improve model performance: private DP training scenarios see an 11.90-15.48% increase in F1-score, while non-private training scenarios still see a 0.45% boost. These results underline the potential of differentially private synthetic data in optimizing utility-privacy trade-offs, especially when few real training samples are available. Through rigorous quality assessments, we confirm the integrity and plausibility of our synthetic data, although their quality degrades significantly as privacy requirements increase.