Model-based disease mapping remains a fundamental policy-informing tool in public health and disease surveillance, with hierarchical Bayesian models being the current state-of-the-art approach. When working with areal data, e.g. aggregates at the level of administrative units such as districts or provinces, routinely used models rely on the adjacency structure of areal units to account for spatial correlations. The goal of disease surveillance systems is to track disease outcomes over time, but this proves challenging in situations of crisis, such as political changes, that lead to changes of administrative boundaries. Kenya is an example of such a country. Moreover, the adjacency-based approach ignores the continuous nature of spatial processes and cannot solve the change-of-support problem, which arises when administrative boundaries change. We present a novel, practical, and easy-to-implement solution that combines deep generative modelling with fully Bayesian inference. We build on the recent PriorVAE method, which encodes spatial priors over small areas with variational autoencoders, to map malaria prevalence in Kenya. We solve the change-of-support problem arising from Kenya changing its district boundaries in 2010. We draw realisations of the Gaussian Process (GP) prior over a fine artificial spatial grid representing continuous space and then aggregate these realisations to the level of administrative boundaries. The aggregated values are then encoded using the PriorVAE technique. The trained priors (aggVAE) are then used at the inference stage in place of the GP priors within a Markov chain Monte Carlo (MCMC) scheme. We demonstrate that it is possible to use a flexible and appropriate model for areal data based on the aggregation of continuous priors, and that inference with aggVAE is orders of magnitude faster than inference combining the original GP priors with the aggregation step.
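The prior-construction step described above (GP draws on a fine grid, then aggregation to areal units) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and not taken from the paper: the unit-square grid, the squared-exponential kernel and its hyperparameters, the toy "old" (two strips) and "new" (four quadrants) boundaries, the use of a simple mean over grid cells as the aggregation rule, and the helper names `rbf_kernel` and `aggregation_matrix`. Training the variational autoencoder on the aggregated draws is omitted.

```python
import numpy as np

def rbf_kernel(x, lengthscale=0.3, variance=1.0):
    """Squared-exponential kernel on 2-D points (assumed kernel choice)."""
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def aggregation_matrix(labels, n_areas):
    """Row a averages the grid cells labelled a (assumed aggregation rule)."""
    M = np.zeros((n_areas, labels.size))
    M[labels, np.arange(labels.size)] = 1.0
    return M / M.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Fine artificial grid over the unit square: centres of a 10 x 10 lattice.
n_side = 10
xs = (np.arange(n_side) + 0.5) / n_side
grid = np.stack(np.meshgrid(xs, xs), axis=-1).reshape(-1, 2)  # (100, 2)

# Draw GP realisations on the grid via a Cholesky factor (jitter for stability).
K = rbf_kernel(grid) + 1e-5 * np.eye(len(grid))
L = np.linalg.cholesky(K)
n_draws = 500
f = L @ rng.standard_normal((len(grid), n_draws))  # (100, n_draws)

# Hypothetical boundaries: 2 vertical strips before the change, 4 quadrants after.
old_labels = (grid[:, 0] >= 0.5).astype(int)
new_labels = old_labels + 2 * (grid[:, 1] >= 0.5).astype(int)

M_old = aggregation_matrix(old_labels, 2)
M_new = aggregation_matrix(new_labels, 4)

# Each row is one joint prior draw over old and new areal units.
# These rows would serve as training data for the VAE encoding step.
agg = np.vstack([M_old @ f, M_new @ f]).T  # (n_draws, 6)
```

Because both sets of areal values are aggregated from the same underlying grid draw, they stay mutually consistent, which is what lets a single continuous prior serve both boundary systems.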