Many two-sample network hypothesis testing methodologies operate under the implicit assumption that the vertex correspondence across networks is known a priori. In this paper, we consider the degradation of power in two-sample graph hypothesis testing when some vertices are misaligned (label-shuffled) across networks. In the context of random dot product graph and stochastic block model networks, we theoretically explore the power loss due to shuffling for a pair of hypothesis tests based on Frobenius norm differences between estimated edge probability matrices or between adjacency matrices. This loss in testing power is further demonstrated in numerous simulations and experiments, in both the stochastic block model and the random dot product graph model, where we compare the power loss across multiple recently proposed tests from the literature. Lastly, we demonstrate the impact that shuffling can have in real-data testing in a pair of examples from neuroscience and from social network analysis.
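To make the setting concrete, the following is a minimal sketch of how vertex shuffling changes a Frobenius-norm statistic computed between two adjacency matrices. It is an illustration under stated assumptions, not the testing procedure analyzed in the paper: the function names (`sample_sbm`, `shuffle_vertices`, `frobenius_stat`), the two-block model, and all parameter values are hypothetical choices made for this example.

```python
import numpy as np


def sample_sbm(block_sizes, B, rng):
    """Sample a symmetric, hollow adjacency matrix from a stochastic block model."""
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    P = B[np.ix_(labels, labels)]  # n x n matrix of edge probabilities
    n = len(labels)
    # Draw independent edges above the diagonal, then symmetrize.
    upper = np.triu(rng.random((n, n)) < P, k=1)
    return (upper | upper.T).astype(float)


def shuffle_vertices(A, k, rng):
    """Relabel A by randomly permuting the labels of k randomly chosen vertices.

    The random permutation may leave some of the chosen vertices fixed,
    so at most k vertices actually move.
    """
    n = A.shape[0]
    perm = np.arange(n)
    if k > 1:
        chosen = rng.choice(n, size=k, replace=False)
        perm[chosen] = rng.permutation(chosen)
    return A[np.ix_(perm, perm)]


def frobenius_stat(A1, A2):
    """Squared Frobenius norm of the difference between two adjacency matrices."""
    return float(np.sum((A1 - A2) ** 2))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical two-block model; both graphs are drawn from the same
    # distribution, so the null hypothesis holds.
    B = np.array([[0.6, 0.1],
                  [0.1, 0.6]])
    sizes = [50, 50]
    A1 = sample_sbm(sizes, B, rng)
    A2 = sample_sbm(sizes, B, rng)

    for k in (0, 10, 50, 100):
        A2_shuffled = shuffle_vertices(A2, k, rng)
        print(k, frobenius_stat(A1, A2_shuffled))
```

In this sketch both graphs come from the same model, so any change in the statistic as `k` grows is attributable to the relabeling alone. A test whose critical value is calibrated on correctly aligned graphs would be miscalibrated once labels are shuffled, which is one way to build intuition for why misalignment can affect testing power.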