Network moments, rescaled counts of motifs such as stars and triangles, are fundamental summaries of network structure, widely used in goodness-of-fit testing, model selection, and network comparison. While the univariate distribution of a single network moment can be approximated by subsampling, the consistency of subsampling for the {\it joint} distribution of several moments has not been established. In this paper, we prove that node subsampling provides an asymptotically accurate approximation of the joint distribution of multiple network moments under a general sparse graphon model. The theoretical analysis requires a careful characterization of the dependence structure among network moments and of the corresponding multivariate asymptotic convergence, going substantially beyond existing univariate results. Building on this foundation, we address a practically important open problem: two-sample testing between unmatchable networks with unequal edge densities. We propose a novel subsampling-based procedure that combines {\it sparsification} with a {\it sample-splitting} strategy. To our knowledge, this is the first subsampling-based inferential procedure valid in this setting. We demonstrate the utility of multivariate subsampling inference through simulation studies and a real-data application comparing coexpression networks of core and non-core genes in a study of parallel adaptation in Trinidadian guppies, where joint and conditional moment distributions reveal a structural difference that no marginal test can detect.
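As a rough illustration of the node-subsampling idea described above, the following Python sketch draws random node subsets from a single observed network, recomputes two moments (edge density and triangle density) on each induced subgraph, and summarizes their empirical joint behavior. It is a minimal sketch under stated assumptions, not the procedure analyzed in the paper: the function names `network_moments` and `node_subsample_moments`, the Erdős–Rényi test graph, the subsample size `b`, and the plain density normalization (with no sparsity rescaling, centering, or studentization) are all illustrative choices.

```python
import numpy as np


def network_moments(A):
    """Return [edge density, triangle density] for a simple undirected graph.

    A is a symmetric 0/1 adjacency matrix with a zero diagonal.
    Assumption: moments are normalized by the number of node pairs/triples
    only; any sparsity-dependent rescaling is omitted.
    """
    n = A.shape[0]
    edges = A.sum() / 2.0                      # each edge is counted twice
    triangles = np.trace(A @ A @ A) / 6.0      # each triangle is counted 6 times
    n_pairs = n * (n - 1) / 2.0
    n_triples = n * (n - 1) * (n - 2) / 6.0
    return np.array([edges / n_pairs, triangles / n_triples])


def node_subsample_moments(A, b, n_rep, rng):
    """Compute the moment vector on n_rep induced subgraphs of b random nodes."""
    n = A.shape[0]
    out = np.empty((n_rep, 2))
    for r in range(n_rep):
        idx = rng.choice(n, size=b, replace=False)   # sample nodes without replacement
        out[r] = network_moments(A[np.ix_(idx, idx)])
    return out


# Hypothetical example: an Erdős–Rényi graph stands in for observed data.
rng = np.random.default_rng(0)
n = 200
upper = np.triu(rng.random((n, n)) < 0.1, k=1)
A = (upper | upper.T).astype(float)

full = network_moments(A)
sub = node_subsample_moments(A, b=50, n_rep=300, rng=rng)

# Empirical covariance of the two subsampled moments: a crude summary of
# their joint (rather than marginal) variability.
joint_cov = np.cov(sub, rowvar=False)
```

The off-diagonal entry of `joint_cov` captures the dependence between the two moments, which is exactly the information that separate marginal approximations discard.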