Researchers commonly encounter large-scale networks in practice (e.g., Facebook and Twitter). The spatial autoregressive (SAR) model is widely used to study interactions among the nodes of such networks. Despite its popularity, estimating a SAR model on a large-scale network remains very challenging. First, because of policy restrictions or high collection costs, independent researchers often cannot observe or collect the complete network. Second, even when the entire network is accessible, estimating the SAR model with the quasi-maximum likelihood estimator (QMLE) can be computationally infeasible. To address these challenges, we propose here a subnetwork estimation method based on QMLE for the SAR model. With an appropriate sampling method, a subnetwork consisting of a much smaller number of nodes can be constructed. The standard QMLE is then computed by treating the sampled subnetwork as if it were the entire network. This substantially reduces both the information collection cost and the computation cost, making estimation more feasible in practice. Theoretically, we show that the subnetwork-based QMLE is consistent and asymptotically normal under appropriate regularity conditions. Extensive simulation studies, based on both simulated and real network structures, are presented.
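The idea of treating a sampled subnetwork as if it were the entire network can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the standard SAR form Y = rho * W * Y + X * beta + eps with a row-normalized W, uniform random node sampling (the abstract does not say which sampling schemes qualify as appropriate), a grid search over rho in the concentrated Gaussian quasi-likelihood, and the hypothetical helper names `row_normalize`, `simulate_sar`, `sample_nodes`, and `sar_qmle`.

```python
import numpy as np


def row_normalize(A):
    """Divide each row by its sum; rows with no links stay all-zero."""
    row_sums = A.sum(axis=1, keepdims=True)
    return A / np.maximum(row_sums, 1.0)


def simulate_sar(W, X, rho, beta, rng):
    """Draw Y from Y = rho*W*Y + X*beta + eps with standard normal eps."""
    n = W.shape[0]
    eps = rng.normal(size=n)
    return np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)


def sample_nodes(n_total, n_sub, rng):
    """Assumed sampling scheme: uniform random node sampling without replacement."""
    return np.sort(rng.choice(n_total, size=n_sub, replace=False))


def sar_qmle(Y, X, W, grid=None):
    """Grid-search QMLE of (rho, beta, sigma^2) via the concentrated log-likelihood."""
    if grid is None:
        grid = np.linspace(-0.95, 0.95, 191)
    n = len(Y)
    eye = np.eye(n)
    X_pinv = np.linalg.pinv(X)  # least-squares projector, (X'X)^{-1} X' for full-rank X
    WY = W @ Y
    best_ll, best = -np.inf, None
    for rho in grid:
        SY = Y - rho * WY                # (I - rho*W) Y
        beta = X_pinv @ SY               # profile estimate of beta given rho
        resid = SY - X @ beta
        sigma2 = resid @ resid / n       # profile estimate of sigma^2 given rho
        _, logdet = np.linalg.slogdet(eye - rho * W)
        ll = -0.5 * n * np.log(sigma2) + logdet
        if ll > best_ll:
            best_ll, best = ll, (rho, beta, sigma2)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, m = 500, 150                      # full network size and subnetwork size (illustrative)
    A = (rng.random((N, N)) < 0.02).astype(float)  # random directed adjacency matrix
    np.fill_diagonal(A, 0.0)
    X = rng.normal(size=(N, 2))
    Y = simulate_sar(row_normalize(A), X, rho=0.4, beta=np.array([1.0, -0.5]), rng=rng)

    # Keep only the sampled nodes and the links among them, then
    # run the standard QMLE as if this subnetwork were the whole network.
    idx = sample_nodes(N, m, rng)
    W_sub = row_normalize(A[np.ix_(idx, idx)])
    rho_hat, beta_hat, sigma2_hat = sar_qmle(Y[idx], X[idx], W_sub)
    print("subnetwork estimates:", rho_hat, beta_hat, sigma2_hat)
```

The sketch only demonstrates the mechanics: the log-determinant and the linear algebra are computed on an m-by-m matrix instead of an N-by-N one. How accurate the resulting estimate is depends on the sampling method and the regularity conditions established in the paper, which this example does not attempt to reproduce.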