Privacy Drift: Evolving Privacy Concerns in Incremental Learning

In the evolving landscape of machine learning (ML), Federated Learning (FL) represents a shift towards decentralized model training that preserves user data privacy. This paper introduces the concept of "privacy drift", a framework that parallels the well-known phenomenon of concept drift. While concept drift describes the variation in model accuracy over time caused by changes in the data, privacy drift captures the variation in the leakage of private information as models undergo incremental training. By defining and examining privacy drift, this study aims to clarify the relationship between the evolution of model performance and the integrity of data privacy. Through experiments, we investigate the dynamics of privacy drift in FL systems, focusing on how model updates and data distribution shifts influence the susceptibility of models to privacy attacks such as membership inference attacks (MIA). Our results highlight a complex interplay between model accuracy and privacy safeguards, revealing that improvements in model performance can lead to increased privacy risks. We provide empirical evidence from experiments on customized datasets derived from CIFAR-100 (Canadian Institute for Advanced Research, 100 classes), showing the impact of data and concept drift on privacy. This work lays the groundwork for future research on privacy-aware machine learning that balances model accuracy and data privacy in decentralized environments.
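To make the idea of tracking leakage across incremental training concrete, here is a minimal sketch, not the paper's method. It assumes a simple loss-threshold membership inference attack (guess "member" when an example's loss is low) and summarizes leakage at each checkpoint by the attack's AUC. The function names `mia_auc` and `privacy_drift`, and the loss values in the usage example, are hypothetical and chosen only for illustration.

```python
import numpy as np


def mia_auc(member_losses, nonmember_losses):
    """AUC of a loss-threshold membership inference attack (assumed attack).

    The attack scores an example as more likely a member when its loss is
    lower, so the AUC equals the probability that a random member has a
    lower loss than a random non-member, with ties counted as one half.
    """
    m = np.asarray(member_losses, dtype=float)[:, None]
    n = np.asarray(nonmember_losses, dtype=float)[None, :]
    return float(np.mean((m < n) + 0.5 * (m == n)))


def privacy_drift(auc_per_round):
    """Change in attack AUC between consecutive training rounds.

    Positive values mean the model became easier to attack after the
    update; this is one possible way to quantify drift, not a definition
    taken from the paper.
    """
    return np.diff(np.asarray(auc_per_round, dtype=float))


# Made-up per-example losses at three checkpoints: (members, non-members).
checkpoints = [
    ([1.0, 0.9, 1.1], [1.0, 1.1, 0.9]),
    ([0.6, 0.5, 0.9], [1.0, 0.8, 0.9]),
    ([0.2, 0.1, 0.3], [0.9, 0.8, 1.0]),
]
aucs = [mia_auc(members, nonmembers) for members, nonmembers in checkpoints]
print("attack AUC per round:", aucs)
print("drift between rounds:", privacy_drift(aucs))
```

An AUC near 0.5 means the attack does no better than chance, while values approaching 1.0 indicate that training-set members are easy to distinguish. In a real evaluation the losses would come from held-in and held-out examples scored by each model checkpoint.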

