Graph Neural Networks (GNNs) have improved unsupervised community detection because they can encode both the connectivity and the feature information of a graph. Identifying latent communities has many practical applications, from social networks to genomics. Current benchmarks of real-world performance are hard to interpret because of the variety of decisions that influence how GNNs are evaluated on this task. To address this, we propose a framework that establishes a common evaluation protocol. We motivate and justify it by demonstrating how results differ with and without the protocol. The W Randomness Coefficient is proposed as a metric that assesses the consistency of algorithm rankings, quantifying the reliability of results in the presence of randomness. We find that when the same evaluation criteria are followed, performance may differ significantly from what has been reported for methods on this task, but a more complete evaluation and comparison of methods becomes possible.
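The abstract does not give a formula for the W Randomness Coefficient, so the following is only an illustrative sketch of the general idea of measuring how consistently algorithms are ranked across random seeds. It assumes Kendall's coefficient of concordance (W) as the agreement measure and reports one minus that value as an inconsistency score; this choice, the function names, and the score-table layout are all assumptions made for illustration and should not be read as the metric's actual definition.

```python
def rank_descending(scores):
    """Rank scores so the highest gets rank 1 (assumes no tied scores)."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranks = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def kendalls_w(score_table):
    """Kendall's W for score_table[s][a]: score of algorithm a under seed s.

    Returns 1.0 when every seed ranks the algorithms identically and
    0.0 when the rankings cancel out completely. No tie correction.
    """
    m = len(score_table)      # number of random seeds (the "raters")
    n = len(score_table[0])   # number of algorithms (the "items")
    rank_sums = [0] * n
    for seed_scores in score_table:
        for algorithm, rank in enumerate(rank_descending(seed_scores)):
            rank_sums[algorithm] += rank
    mean_rank_sum = m * (n + 1) / 2
    s = sum((total - mean_rank_sum) ** 2 for total in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


def ranking_inconsistency(score_table):
    """Hypothetical summary score: 0 means rankings never change across seeds."""
    return 1.0 - kendalls_w(score_table)


# Hypothetical scores: three seeds (rows) by three algorithms (columns).
stable = [[0.9, 0.5, 0.1], [0.8, 0.6, 0.2], [0.7, 0.4, 0.3]]
unstable = [[0.9, 0.5, 0.1], [0.5, 0.1, 0.9], [0.1, 0.9, 0.5]]
print(ranking_inconsistency(stable), ranking_inconsistency(unstable))
```

In this sketch, a table where every seed produces the same ordering yields an inconsistency of 0, while a table where the orderings rotate across seeds yields 1. A fuller evaluation would need to handle tied scores and aggregate over several metrics and datasets.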