This paper develops TransNet, a new spectral clustering-based method for transfer learning in community detection on network data. Our goal is to improve clustering performance on the target network using auxiliary source networks that are heterogeneous, privacy-preserved, and stored locally across various sources. The edges of each locally stored network are perturbed using the randomized response mechanism to achieve differential privacy. Notably, we allow the source networks to have distinct privacy-preserving and heterogeneity levels, as is often desired in practice. To better utilize the information from the source networks, we propose a novel adaptive weighting method that aggregates their eigenspaces, with weights chosen to account for the effects of privacy and heterogeneity. We then propose a regularization method that combines the weighted average eigenspace of the source networks with the eigenspace of the target network to achieve an optimal balance between them. Theoretically, we show that the adaptive weighting method enjoys the error-bound-oracle property, in the sense that the error bound of the estimated eigenspace depends only on the informative source networks. We also demonstrate that TransNet performs better than both the estimator using only the target network and the estimator using only the weighted source networks.
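To make two of the ingredients named above concrete, the following is a minimal Python sketch, not the TransNet procedure itself. It assumes a symmetric randomized response in which each edge indicator is kept with probability e^ε/(1+e^ε) and flipped otherwise, followed by the standard debiasing step for that mechanism. It also assumes that eigenspaces are combined by averaging their projection matrices, which is one common way to sidestep rotation ambiguity. The function names and the `weights` argument are hypothetical; the weights are simply supplied by the caller, and the adaptive rule for choosing them is not reproduced here.

```python
import numpy as np


def randomized_response(adj, epsilon, rng):
    """Perturb a symmetric 0/1 adjacency matrix edge by edge.

    Assumption: each upper-triangular entry is kept with probability
    q = e^eps / (1 + e^eps) and flipped with probability 1 - q.
    Returns the perturbed matrix and q.
    """
    n = adj.shape[0]
    keep_prob = np.exp(epsilon) / (1.0 + np.exp(epsilon))
    iu = np.triu_indices(n, k=1)                 # undirected: perturb i < j only
    flip = rng.random(iu[0].size) > keep_prob    # True with probability 1 - q
    upper = adj[iu].astype(int)
    upper = np.where(flip, 1 - upper, upper)
    out = np.zeros((n, n), dtype=int)
    out[iu] = upper
    return out + out.T, keep_prob                # symmetrize, zero diagonal


def debias(perturbed, keep_prob):
    """Entrywise unbiased estimate of the original adjacency matrix.

    Uses E[perturbed_ij] = (2q - 1) * A_ij + (1 - q), valid for q != 1/2.
    """
    est = (perturbed - (1.0 - keep_prob)) / (2.0 * keep_prob - 1.0)
    np.fill_diagonal(est, 0.0)
    return est


def top_eigenspace(mat, k):
    """Orthonormal basis for the k eigenvalues of largest magnitude."""
    vals, vecs = np.linalg.eigh(mat)
    idx = np.argsort(np.abs(vals))[::-1][:k]
    return vecs[:, idx]


def aggregate_eigenspaces(bases, weights, k):
    """Weighted average of projection matrices U U^T, then its top-k eigenspace.

    `weights` are caller-supplied placeholders, not the paper's adaptive weights.
    """
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    proj = sum(w * (U @ U.T) for w, U in zip(weights, bases))
    return top_eigenspace(proj, k)
```

In this sketch, a source network would be passed through `randomized_response` before leaving its owner. The receiver would then apply `debias`, extract a basis with `top_eigenspace`, and pool the bases with `aggregate_eigenspaces`. A smaller ε flips more edges, so a weighting rule that accounts for privacy would be expected to down-weight such sources.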