Synthetic contact networks are useful for modeling epidemic spread and social transmission, but the data needed to infer realistic contact patterns that account for geographically and economically assortative connections are limited. We developed a method to generate synthetic contact networks for any region of the United States from publicly available data. First, we generate a synthetic population of individuals within households from US census data using combinatorial optimization. We then assign individuals to workplaces and schools using commute data, employment statistics, and school enrollment data. Finally, we connect the resulting population into a realistic contact network using graph generation algorithms. We test the method on two census regions and show that the synthetic populations accurately reflect the source data. We further show that the contact networks have distinct properties compared to networks generated without a synthetic population, and that those differences affect the rate of disease transmission in an epidemiological simulation. We provide open-source software to generate a synthetic population and contact network for any area within the US.
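The three stages described above (households, workplace or school assignment, network construction) can be illustrated with a minimal Python sketch. It is not the released software: the function name `build_contact_network` and its parameters are hypothetical, uniform random workplace assignment stands in for the commute, employment, and enrollment data, and fully connected groups stand in for the graph generation algorithms. The combinatorial optimization against census data is omitted entirely; household sizes are simply passed in.

```python
import random
from collections import defaultdict
from itertools import combinations


def build_contact_network(household_sizes, n_workplaces, seed=0):
    """Toy three-stage pipeline; every modeling choice here is an assumption."""
    rng = random.Random(seed)

    # Stage 1: create individuals and place them in households.
    # (The paper fits households to census data; here sizes are given.)
    household_of = {}
    person_id = 0
    for household_id, size in enumerate(household_sizes):
        for _ in range(size):
            household_of[person_id] = household_id
            person_id += 1

    # Stage 2: assign each individual to a workplace or school.
    # (Assumption: uniform random choice instead of commute/enrollment data.)
    workplace_of = {p: rng.randrange(n_workplaces) for p in household_of}

    # Stage 3: connect individuals who share a household or a workplace.
    # (Assumption: every such group is a clique.)
    edges = set()
    for grouping in (household_of, workplace_of):
        members = defaultdict(list)
        for person, group in grouping.items():
            members[group].append(person)
        for group_members in members.values():
            edges.update(combinations(sorted(group_members), 2))

    return household_of, workplace_of, edges


if __name__ == "__main__":
    households, workplaces, contacts = build_contact_network(
        household_sizes=[2, 3, 1], n_workplaces=2, seed=1
    )
    print(len(households), "people,", len(contacts), "contact edges")
```

Even in this toy form, the household and workplace layers produce clustered, overlapping groups that a network generated without an underlying population would not have, which is the kind of structural difference the abstract says affects transmission rates.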