L-2 Regularized maximum likelihood for $\beta$-model in large and sparse networks

The $\beta$-model is a powerful tool for modeling network generation driven by degree heterogeneity. Its simple yet expressive form is particularly well suited to large and sparse networks, where many network models become infeasible because of computational cost and scarce observations. However, existing estimation algorithms for the $\beta$-model do not scale, and theoretical understanding remains limited to dense networks. This paper makes several significant improvements to the method and theory of the $\beta$-model to address urgent needs of practical applications. Our contributions include: 1. method: we propose a new $\ell_2$-penalized MLE scheme and design a novel fast algorithm that can comfortably handle sparse networks with millions of nodes, much faster and more memory-parsimonious than all existing algorithms; 2. theory: we present new error bounds for $\beta$-models under much weaker assumptions than the best known results in the literature; we also establish new lower bounds and new asymptotic normality results; under proper parameter sparsity assumptions, we show the first local rate-optimality result in $\ell_2$ norm; unlike the existing literature, our results cover both small and large regularization scenarios and reveal their distinct asymptotic dependency structures; 3. application: we apply our method to large COVID-19 network data sets and discover meaningful results.
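To make the idea of an $\ell_2$-penalized MLE concrete, here is a minimal sketch. It uses the standard $\beta$-model, in which nodes $i$ and $j$ are linked independently with probability $e^{\beta_i+\beta_j}/(1+e^{\beta_i+\beta_j})$. The ridge penalty $\frac{\lambda}{2}\|\beta\|_2^2$ is an assumption, since the abstract does not state the exact form of the penalty. The dense $O(n^2)$ gradient descent below is only an illustration and is not the fast algorithm the paper proposes; all function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    """Logistic function, applied elementwise."""
    return 1.0 / (1.0 + np.exp(-x))


def simulate_beta_model(beta, rng):
    """Draw a symmetric 0/1 adjacency matrix (no self-loops) from the beta-model."""
    n = beta.shape[0]
    p = sigmoid(beta[:, None] + beta[None, :])
    upper = np.triu(rng.random((n, n)) < p, k=1)  # sample each pair i < j once
    return (upper | upper.T).astype(float)


def penalized_gradient(beta, degrees, lam):
    """Gradient of the negative log-likelihood plus (lam / 2) * ||beta||^2.

    The penalty form is an assumption made for this sketch.
    """
    p = sigmoid(beta[:, None] + beta[None, :])
    np.fill_diagonal(p, 0.0)  # no self-loops
    return p.sum(axis=1) - degrees + lam * beta


def fit_l2_penalized_mle(A, lam=1.0, n_iter=500):
    """Plain gradient descent on the penalized objective (dense, O(n^2) per step)."""
    n = A.shape[0]
    degrees = A.sum(axis=1)  # the degree sequence is the sufficient statistic
    beta = np.zeros(n)
    # The Hessian's largest eigenvalue is at most (n - 1) / 2 + lam (Gershgorin),
    # so this fixed step size never overshoots.
    step = 1.0 / ((n - 1) / 2.0 + lam)
    for _ in range(n_iter):
        beta -= step * penalized_gradient(beta, degrees, lam)
    return beta
```

The fit depends on the network only through its degree sequence, which is what makes the model tractable in principle. A sparse, scalable implementation would avoid forming the dense $n \times n$ probability matrix used here.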

