An important question in statistical network analysis is how to estimate models of discrete and dependent network data with intractable likelihood functions, without sacrificing computational scalability and statistical guarantees. We demonstrate that scalable estimation of random graph models with dependent edges is possible, by establishing convergence rates of pseudo-likelihood-based $M$-estimators for discrete undirected graphical models with exponential parameterizations and parameter vectors of increasing dimension in single-observation scenarios. We highlight the impact of two complex phenomena on the convergence rate: phase transitions and model near-degeneracy. The main results have possible applications to discrete and dependent network, spatial, and temporal data. To showcase convergence rates, we introduce a novel class of generalized $\beta$-models with dependent edges and parameter vectors of increasing dimension, which leverage additional structure in the form of overlapping subpopulations to control dependence. We establish convergence rates of pseudo-likelihood-based $M$-estimators for generalized $\beta$-models in dense- and sparse-graph settings.