Modern network analysis often involves multi-layer network data in which the nodes are aligned across layers and the edges on each layer represent one of several relations among the nodes. The current literature on multi-layer network data is mostly limited to undirected relations. However, directed relations are more common and may carry extra information. In this paper, we study community detection (or clustering) in multi-layer directed networks. To account for the asymmetry, we develop a novel spectral-co-clustering-based algorithm to detect co-clusters, which capture the sending patterns and receiving patterns of nodes, respectively. Specifically, we compute the eigen-decomposition of the debiased sum of Gram matrices of the layer-wise adjacency matrices, followed by k-means; the sum of Gram matrices is used to avoid the possible cancellation of clusters caused by direct summation. We provide a theoretical analysis of the algorithm under the multi-layer stochastic co-block model, where we relax the common assumption that the number of clusters is coupled with the rank of the model. After a systematic analysis of the eigenvectors of the population version of the algorithm, we derive misclassification rates which show that multiple layers benefit clustering performance. Experimental results on simulated data corroborate the theoretical predictions, and an analysis of a real-world trade network dataset provides interpretable results.
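To make the spectral step concrete, the following is a minimal sketch of the pipeline the abstract describes, not the paper's implementation. Several details are assumptions made for illustration: the debiasing is taken to be removal of the diagonal of each Gram matrix (the abstract does not state the exact correction), eigenvectors are ranked by absolute eigenvalue, k-means is a plain Lloyd iteration with random restarts, and the function names `debiased_gram_sum`, `leading_eigenvectors`, `kmeans`, and `co_cluster` are hypothetical. The `rank` argument is kept separate from the cluster numbers, reflecting the decoupling of cluster number and rank mentioned above.

```python
import numpy as np


def debiased_gram_sum(layers, side="row"):
    """Sum Gram matrices over layers: A A^T (sending side) or A^T A (receiving side)."""
    n = np.asarray(layers[0]).shape[0]
    total = np.zeros((n, n))
    for A in layers:
        A = np.asarray(A, dtype=float)
        G = A @ A.T if side == "row" else A.T @ A
        # Assumed debiasing: drop the diagonal of each Gram matrix
        # (for a 0/1 adjacency matrix the diagonal equals the node degree).
        total += G - np.diag(np.diag(G))
    return total


def leading_eigenvectors(M, rank):
    """Eigenvectors of symmetric M for the `rank` largest eigenvalues in absolute value."""
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(-np.abs(vals))[:rank]
    return vecs[:, order]


def kmeans(X, k, n_iter=100, n_init=10, seed=0):
    """Plain Lloyd k-means with random restarts; returns the lowest-cost labels."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dist.argmin(axis=1)
            # Keep the old center if a cluster becomes empty.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        cost = dist.min(axis=1).sum()
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels


def co_cluster(layers, k_row, k_col, rank):
    """Return (sending-cluster labels, receiving-cluster labels)."""
    U = leading_eigenvectors(debiased_gram_sum(layers, "row"), rank)
    V = leading_eigenvectors(debiased_gram_sum(layers, "col"), rank)
    return kmeans(U, k_row), kmeans(V, k_col)


# Toy example (hypothetical, noiseless data): two layers whose block patterns
# are complementary, so the direct sum A1 + A2 is the all-ones matrix.
n = 60
z_row = np.repeat([0, 1], n // 2)   # sending clusters
z_col = np.arange(n) % 2            # receiving clusters
A1 = (z_row[:, None] == z_col[None, :]).astype(int)
A2 = 1 - A1
row_labels, col_labels = co_cluster([A1, A2], k_row=2, k_col=2, rank=2)
```

In the toy example the two layers cancel under direct summation, since `A1 + A2` carries no block structure, whereas the summed Gram matrices still separate the sending and receiving clusters. With real, noisy adjacency matrices the recovered labels would only be expected to match the truth approximately.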