The last two decades have seen considerable progress in foundational aspects of statistical network analysis, but the path from theory to application is not straightforward. Two large, heterogeneous samples of small networks of within-household contacts in Belgium were collected using two different but complementary sampling designs: one smaller but with all contacts in each household observed, the other larger and more representative but recording contacts of only one person per household. We wish to combine their strengths to learn the social forces that shape household contact formation and facilitate simulation for prediction of disease spread, while generalising to the population of households in the region. To accomplish this, we describe a flexible framework for specifying multi-network models in the exponential family class and identify the requirements for inference and prediction under this framework to be consistent, identifiable, and generalisable, even when data are incomplete; explore how these requirements may be violated in practice; and develop a suite of quantitative and graphical diagnostics for detecting violations and suggesting improvements to candidate models. We report on the effects of network size, geography, and household roles on household contact patterns (activity, heterogeneity in activity, and triadic closure).