In social networks, discovering community structure has received considerable attention as a fundamental problem underlying many network analysis tasks. However, privacy concerns or access restrictions often leave the network structure uncertain, which renders established community detection approaches ineffective unless the topology is acquired at considerable cost. To tackle this challenge, we present META-CODE, a unified framework that detects overlapping communities via exploratory learning aided by easy-to-collect node metadata when the network topology is unknown (or only partially known). Specifically, after an initial network inference step, META-CODE iterates over three steps: 1) learning node-level community-affiliation embeddings with graph neural networks (GNNs) trained by our new reconstruction loss, 2) exploring the network via community-affiliation-based node queries, and 3) inferring the network from the explored network using an edge connectivity-based Siamese neural network model. Through extensive experiments on five real-world datasets, including two large networks, we demonstrate: (a) the superiority of META-CODE over benchmark community detection methods, with gains of up to 151.27% over the best existing competitor, (b) the impact of each module in META-CODE, (c) the effectiveness of node queries in META-CODE, supported by both empirical evaluations and theoretical findings, (d) the convergence of the inferred network, and (e) the computational efficiency of META-CODE.
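The control flow of the three iterative steps can be illustrated with a toy sketch. This is not the authors' implementation: every function name, the thresholds, and the five-node dataset are hypothetical. The GNN embedding step is replaced by simple neighbor averaging of metadata, the Siamese model by a cosine-similarity threshold on metadata, and the query rule by a "most ambiguous node first" heuristic. Only the loop structure (initial inference, then embed, query, and re-infer) follows the description above.

```python
import math
from itertools import combinations

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def infer_network(metadata, explored, queried, threshold=0.8):
    """Stand-in for the Siamese model (assumption): keep observed edges and
    guess an edge between unqueried nodes with similar metadata."""
    edges = set(explored)
    for u, v in combinations(sorted(metadata), 2):
        if u in queried or v in queried:
            continue  # a queried node's neighborhood is already fully known
        if cosine(metadata[u], metadata[v]) >= threshold:
            edges.add((u, v))
    return edges

def affiliation_embeddings(metadata, edges, rounds=2):
    """Stand-in for the GNN (assumption): average each node's vector with
    its neighbors' vectors; one metadata dimension per community."""
    neighbors = {n: set() for n in metadata}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    emb = {n: list(vec) for n, vec in metadata.items()}
    for _ in range(rounds):
        emb = {
            n: [sum(col) / (len(neighbors[n]) + 1)
                for col in zip(emb[n], *(emb[m] for m in neighbors[n]))]
            for n in emb
        }
    return emb

def communities(emb, cutoff=0.4):
    """A node joins every community whose score reaches the cutoff,
    so memberships may overlap."""
    k = len(next(iter(emb.values())))
    return [{n for n, vec in emb.items() if vec[c] >= cutoff} for c in range(k)]

def select_query(emb, queried):
    """Hypothetical query rule: pick the unqueried node whose strongest
    affiliation is weakest, i.e. the most ambiguous node."""
    candidates = [n for n in sorted(emb) if n not in queried]
    return min(candidates, key=lambda n: max(emb[n])) if candidates else None

def run_sketch(metadata, hidden_edges, budget):
    explored, queried = set(), set()
    inferred = infer_network(metadata, explored, queried)      # initial inference
    for _ in range(budget):
        emb = affiliation_embeddings(metadata, inferred)       # step 1: embed
        node = select_query(emb, queried)                      # step 2: query
        if node is None:
            break
        queried.add(node)
        explored |= {e for e in hidden_edges if node in e}     # reveal neighbors
        inferred = infer_network(metadata, explored, queried)  # step 3: re-infer
    return communities(affiliation_embeddings(metadata, inferred)), inferred

# Toy data (hypothetical): node 2 belongs to both communities.
metadata = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.5, 0.5], 3: [0.1, 0.9], 4: [0.0, 1.0]}
hidden_edges = {(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)}

found, inferred = run_sketch(metadata, hidden_edges, budget=5)
print(found, sorted(inferred))
```

With a query budget that covers every node, the inferred network in this toy coincides with the hidden one. Smaller budgets leave part of the graph to the metadata-based guess, which is the regime the framework targets.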