The primary promise of decentralized learning is to allow users to collaboratively train machine learning models while keeping their data on their premises and without relying on any central entity. However, this paradigm requires the exchange of model parameters or gradients between peers, and such exchanges can be exploited by privacy attacks (e.g., Membership Inference Attacks, MIA) to infer sensitive information about the training data. To devise effective defense mechanisms, it is important to understand which factors increase or reduce the vulnerability of a given decentralized learning architecture to MIA. In this study, we extensively explore the vulnerability to MIA of various decentralized learning architectures by varying the graph structure (e.g., the number of neighbors), the graph dynamics, and the aggregation strategy, across diverse datasets and data distributions. Our key finding, which to the best of our knowledge we are the first to report, is that the vulnerability to MIA is strongly correlated with (i) the local model mixing strategy performed by each node upon receiving models from neighboring nodes and (ii) the global mixing properties of the communication graph. We support these results experimentally on four datasets and by theoretically analyzing the mixing properties of various decentralized architectures. Our paper draws a set of lessons learned for designing decentralized learning systems that reduce the vulnerability to MIA by design.
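To make the local model mixing step concrete, the following is a minimal sketch (with hypothetical names, not the paper's actual implementation) of one common aggregation strategy: each node averages its own parameters uniformly with those received from its neighbors.

```python
import numpy as np

def mix_local_model(local_params: np.ndarray,
                    neighbor_params: list[np.ndarray]) -> np.ndarray:
    """Uniform gossip averaging: one illustrative mixing strategy among
    those the study varies. Averages the node's own model with the
    models received from its neighbors."""
    all_params = [local_params] + neighbor_params
    return np.mean(all_params, axis=0)

# Example: a node with two neighbors mixes three parameter vectors.
local = np.array([1.0, 2.0])
neighbors = [np.array([3.0, 4.0]), np.array([5.0, 6.0])]
mixed = mix_local_model(local, neighbors)  # -> array([3., 4.])
```

How aggressively such a step spreads information across the graph (here, uniformly over all received models) is precisely the kind of mixing property the study correlates with MIA vulnerability.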