Gaussian processes are a widely used statistical tool for non-parametric inference in the applied sciences, with many computational packages available for fitting them to data and predicting future observations. We study the use of the Greta software for Bayesian inference to apply Gaussian process regression to spatio-temporal data on infectious disease outbreaks and to predict future disease spread. Greta builds on TensorFlow, making it comparatively easy to take advantage of the significant speed gains offered by GPUs. For these complex spatio-temporal models, we show a reduction of up to 70\% in computational time relative to fitting the same models on CPUs. We also show how the choice of covariance kernel affects the ability to infer spread and to extrapolate to unobserved spatial and temporal units. The inference pipeline is applied to weekly incidence data on tuberculosis in the East and West Midlands regions of England over a period of two years.
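The effect of kernel choice on extrapolation can be illustrated with a minimal NumPy sketch. This is not the Greta/TensorFlow pipeline used in the study: the synthetic weekly series, the two kernels compared, all hyperparameter values, and every function name below are assumptions chosen purely for illustration.

```python
import numpy as np


def rbf_kernel(xa, xb, lengthscale=4.0, variance=1.0):
    """Squared-exponential kernel between 1-D input arrays (illustrative values)."""
    diff = xa[:, None] - xb[None, :]
    return variance * np.exp(-0.5 * (diff / lengthscale) ** 2)


def periodic_kernel(xa, xb, period=52.0, lengthscale=1.0, variance=1.0):
    """Periodic kernel; the 52-week period is an assumption, not a fitted value."""
    diff = xa[:, None] - xb[None, :]
    return variance * np.exp(-2.0 * np.sin(np.pi * diff / period) ** 2 / lengthscale ** 2)


def gp_posterior_mean(kernel, x_train, y_train, x_test, noise_sd=0.1):
    """Posterior mean of a zero-mean GP with Gaussian observation noise."""
    K = kernel(x_train, x_train) + noise_sd ** 2 * np.eye(len(x_train))
    K_star = kernel(x_test, x_train)
    # Cholesky factorisation for a numerically stable solve of K @ alpha = y.
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train))
    return K_star @ alpha


# Hypothetical data: two years of a noisy seasonal weekly signal.
rng = np.random.default_rng(0)
weeks = np.arange(104, dtype=float)
y = np.sin(2 * np.pi * weeks / 52) + 0.1 * rng.standard_normal(weeks.size)

# Extrapolate 26 weeks beyond the observed period.
future = np.arange(104, 130, dtype=float)
truth = np.sin(2 * np.pi * future / 52)

mean_rbf = gp_posterior_mean(rbf_kernel, weeks, y, future)
mean_per = gp_posterior_mean(periodic_kernel, weeks, y, future)

rmse_rbf = float(np.sqrt(np.mean((mean_rbf - truth) ** 2)))
rmse_per = float(np.sqrt(np.mean((mean_per - truth) ** 2)))
print(f"RMSE (squared-exponential): {rmse_rbf:.3f}")
print(f"RMSE (periodic):            {rmse_per:.3f}")
```

In this toy setting the squared-exponential kernel reverts to the prior mean a few length-scales past the last observation, whereas the periodic kernel carries the seasonal pattern forward. Real incidence data would also need a spatial dimension and a count likelihood, neither of which is modelled here.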