Analyzing time-series data that may contain personal information, particularly in the medical field, raises serious privacy concerns. Sensitive health data from patients is often used to train machine-learning models for diagnostics and ongoing care. Assessing the privacy risk of such models is crucial to making informed decisions about whether to use a model in production, share it with third parties, or deploy it in patients' homes. Membership Inference Attacks (MIAs) are a key method for this kind of evaluation; however, time-series prediction models have not been thoroughly studied in this context. We explore existing MIA techniques on time-series models and introduce new features that focus on the seasonality and trend components of the data. Seasonality is estimated using a multivariate Fourier transform, and a low-degree polynomial is used to approximate trends. We apply these techniques to various types of time-series models, using datasets from the health domain. Our results demonstrate that these new features enhance the effectiveness of MIAs in identifying membership, improving the understanding of privacy risks in medical data applications.
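To make the trend and seasonality features more concrete, the following is a minimal sketch of how such features might be computed with NumPy. It is an illustration under stated assumptions, not the authors' implementation: the function names (`trend_component`, `seasonal_component`, `mia_features`), the polynomial degree, the number of retained frequencies, and the choice of mean-squared gaps between true and predicted series are all hypothetical. For simplicity it applies a one-dimensional real FFT to a single channel rather than the multivariate Fourier transform mentioned in the abstract.

```python
import numpy as np


def trend_component(series, degree=2):
    """Evaluate a least-squares polynomial fit of the given degree at each time step."""
    # Rescale time to [-1, 1] to keep the fit numerically well conditioned.
    t = np.linspace(-1.0, 1.0, len(series))
    coeffs = np.polynomial.polynomial.polyfit(t, series, degree)
    return np.polynomial.polynomial.polyval(t, coeffs)


def seasonal_component(series, n_freqs=3):
    """Reconstruct the series from its n_freqs strongest non-constant frequencies."""
    spectrum = np.fft.rfft(series)
    magnitudes = np.abs(spectrum)
    magnitudes[0] = 0.0  # ignore the constant (zero-frequency) term
    keep = np.argsort(magnitudes)[-n_freqs:]
    filtered = np.zeros_like(spectrum)
    filtered[keep] = spectrum[keep]
    return np.fft.irfft(filtered, n=len(series))


def mia_features(y_true, y_pred, degree=2, n_freqs=3):
    """Hypothetical attack features: mean-squared gaps between the true and
    predicted series in raw, trend, and seasonal form."""
    trend_true = trend_component(y_true, degree)
    trend_pred = trend_component(y_pred, degree)
    # Seasonality is estimated on the detrended residuals (an assumption).
    seas_true = seasonal_component(y_true - trend_true, n_freqs)
    seas_pred = seasonal_component(y_pred - trend_pred, n_freqs)
    return np.array([
        np.mean((y_true - y_pred) ** 2),
        np.mean((trend_true - trend_pred) ** 2),
        np.mean((seas_true - seas_pred) ** 2),
    ])
```

In a sketch like this, the resulting feature vector would be passed to whatever attack classifier or threshold rule is used to separate training-set members from non-members; that downstream step is not shown here.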