This chapter provides a self-contained introduction to the use of Bayesian inference to extract large-scale modular structures from network data, based on the stochastic blockmodel (SBM) and its degree-corrected and overlapping generalizations. We focus on nonparametric formulations, which can be inferred in a manner that prevents overfitting and enables model selection. We discuss the choice of priors, in particular how to avoid underfitting by using deeper Bayesian hierarchies, and we contrast the task of sampling network partitions from the posterior distribution with that of finding the single point estimate that maximizes it, describing efficient algorithms for both. We also show how inferring the SBM can be used to predict missing and spurious links, and we shed light on the fundamental limits on the detectability of modular structures in networks.