This paper studies information-theoretic secure aggregation in a three-layer hierarchical network, where clustered users communicate with an aggregation server through an intermediate layer of relays. We consider a general setting with arbitrary heterogeneous data assignment across users, where "arbitrary" means that the data assignment is given in advance and "heterogeneous" means that users may hold different numbers of datasets. Each user locally computes partially aggregated gradients from its assigned datasets, takes them as its input, and transmits the masked input to its associated relay. The relays then forward aggregated messages to the server, which aims to recover the sum of the gradients. Some users may drop out unpredictably, and the server must still correctly recover the desired aggregation from the surviving users. Moreover, the server or any relay may collude with a subset of users. We impose two security constraints: (i) server security, which requires that the server learn only the sum of the gradients and nothing more about individual inputs; and (ii) relay security, which requires that each relay learn nothing about the users' inputs. Under these constraints, we propose an aggregation scheme that guarantees information-theoretic security and achieves the optimal two-layer communication loads.
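The user-to-relay-to-server flow described above can be illustrated with a toy sketch. It is not the scheme proposed in this paper: it assumes scalar inputs in a prime field, zero-sum additive masks handed out by a trusted source, no dropouts, no collusion, and one input per user, so it ignores the heterogeneous data assignment entirely. The names `P`, `zero_sum_masks`, and `hierarchical_aggregate` are hypothetical and chosen only for illustration.

```python
import secrets

P = 2**31 - 1  # prime modulus of the toy finite field (an assumption)


def zero_sum_masks(num_users, p=P):
    """Draw masks over GF(p) that are uniform subject to summing to zero."""
    masks = [secrets.randbelow(p) for _ in range(num_users - 1)]
    masks.append(-sum(masks) % p)  # last mask cancels all the others
    return masks


def hierarchical_aggregate(clusters, p=P):
    """clusters[r] lists the scalar inputs of the users attached to relay r.

    Returns (relay_messages, recovered_sum).
    """
    num_users = sum(len(c) for c in clusters)
    masks = iter(zero_sum_masks(num_users, p))
    relay_messages = []
    for inputs in clusters:
        # Each user uploads its input plus its own mask to its relay.
        masked = [(x + next(masks)) % p for x in inputs]
        # The relay forwards only the sum of what it received.
        relay_messages.append(sum(masked) % p)
    # The masks cancel once the server adds all relay messages.
    return relay_messages, sum(relay_messages) % p


clusters = [[3, 5], [7, 11], [13]]  # three relays, five users
relay_messages, total = hierarchical_aggregate(clusters)
print(total)  # 39
```

In this sketch, with at least two non-empty clusters, the masks seen by any single relay are jointly uniform, and each relay message reaches the server still hidden by the sum of its cluster's masks. Handling dropouts and colluding users, which the abstract requires, needs additional key structure that this toy does not attempt.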