Topology-Based Reconstruction Prevention for Decentralised Learning

Decentralised learning has recently gained traction as an alternative to federated learning in which both data and coordination are distributed over its users. To preserve data confidentiality, decentralised learning relies on differential privacy, multi-party computation, or a combination thereof. However, running multiple privacy-preserving summations in sequence may allow adversaries to perform reconstruction attacks. Unfortunately, current reconstruction countermeasures either cannot trivially be adapted to the distributed setting, or add excessive amounts of noise. In this work, we first show that passive honest-but-curious adversaries can infer other users' private data after several privacy-preserving summations. For example, in subgraphs with 18 users, we show that only three passive honest-but-curious adversaries succeed at reconstructing private data 11.0% of the time, requiring an average of 8.8 summations per adversary. The success rate depends only on the adversaries' direct neighbourhood, independent of the size of the full network. We consider weak adversaries, who do not control the graph topology and can exploit neither the inner workings of the summation protocol nor the specifics of users' data. We develop a mathematical understanding of how reconstruction relates to topology and propose the first topology-based decentralised defence against reconstruction attacks. Specifically, we show that reconstruction requires a number of adversaries linear in the length of the network's shortest cycle. Consequently, reconstructing private data from privacy-preserving summations is impossible in acyclic networks. Our work is a stepping stone for a formal theory of topology-based reconstruction defences. Such a theory would generalise our countermeasure beyond summation, define confidentiality in terms of entropy, and describe the effects of differential privacy.
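
The link between cycles and reconstruction can be illustrated with a small linear-algebra sketch. It is a simplified model and not the construction used in this work: it assumes a single summation round, that each adversary learns exactly the sum of its neighbours' private values, and that colluding adversaries subtract each other's known values. Under those assumptions, an honest user's value is recoverable when its unit vector lies in the row space of the adversaries' combined equations. The function names and the two toy graphs are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a matrix over the rationals, via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def knowledge_matrix(edges, adversaries, honest):
    """One row per adversary: indicator vector of its honest neighbours.

    Assumption: adversaries collude, so other adversaries' values are
    known and already subtracted from each observed sum.
    """
    index = {h: k for k, h in enumerate(honest)}
    rows = []
    for a in adversaries:
        row = [0] * len(honest)
        for u, v in edges:
            if u == a and v in index:
                row[index[v]] = 1
            if v == a and u in index:
                row[index[u]] = 1
        rows.append(row)
    return rows


def reconstructible(edges, adversaries, honest):
    """Honest users whose value is uniquely determined by the sums."""
    rows = knowledge_matrix(edges, adversaries, honest)
    base = rank(rows)
    found = []
    for k, h in enumerate(honest):
        unit = [1 if j == k else 0 for j in range(len(honest))]
        # The unit vector is in the row space iff adding it keeps the rank.
        if rank(rows + [unit]) == base:
            found.append(h)
    return found


adversaries = ["a1", "a2", "a3"]

# Toy graph with a cycle of length 6: h1-a1-h2-a2-h3-a3-h1.
cycle_edges = [("a1", "h1"), ("a1", "h2"), ("a2", "h2"),
               ("a2", "h3"), ("a3", "h3"), ("a3", "h1")]
print(reconstructible(cycle_edges, adversaries, ["h1", "h2", "h3"]))
# ['h1', 'h2', 'h3']

# Toy acyclic graph: a3 now neighbours h4 instead of h3.
tree_edges = [("a1", "h1"), ("a1", "h2"), ("a2", "h2"),
              ("a2", "h3"), ("a3", "h1"), ("a3", "h4")]
print(reconstructible(tree_edges, adversaries, ["h1", "h2", "h3", "h4"]))
# []
```

In the cyclic toy graph the three sums x1 + x2, x2 + x3, and x1 + x3 form an invertible system, so all three values are exposed. In the acyclic variant every adversary still sees two honest neighbours, but no combination of the sums isolates a single value.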
