Network data is prevalent in numerous big data applications, including economics and health networks, where it is of prime importance to understand the latent structure of the network. In this paper, we model the network using the Degree-Corrected Mixed Membership (DCMM) model. In the DCMM model, each node $i$ has a membership vector $\boldsymbol{\pi}_i = (\boldsymbol{\pi}_i(1), \boldsymbol{\pi}_i(2), \ldots, \boldsymbol{\pi}_i(K))$, where $\boldsymbol{\pi}_i(k)$ denotes the weight that node $i$ puts on community $k$. We derive a novel finite-sample expansion for the $\boldsymbol{\pi}_i(k)$s, which allows us to obtain asymptotic distributions and confidence intervals for the membership mixing probabilities and other related population quantities. This fills an important gap in uncertainty quantification for the membership profile. We further develop a ranking scheme for the vertices based on their membership mixing probabilities in certain communities and perform the relevant statistical inference. A multiplier bootstrap method is proposed for ranking inference on an individual member's profile with respect to a given community. The validity of our theoretical results is further demonstrated via numerical experiments on both real and synthetic data examples.