Epidemiological investigations of regionally aggregated spatial data often involve detecting spatial health disparities between neighboring regions on a map of disease mortality or incidence rates. Analyses of such data introduce spatial dependence among the health outcomes and seek to report statistically significant spatial disparities by delineating boundaries that separate neighboring regions with widely disparate health outcomes. However, current statistical methods are often inadequate for appropriately defining what constitutes a spatial disparity and for constructing rankings of posterior probabilities that are robust to changes in that definition. More specifically, non-parametric Bayesian approaches endow spatial effects with discrete probability distributions using Dirichlet processes, or generalizations thereof, and rely upon computationally intensive methods for inference on weakly identified parameters. In this manuscript, we introduce a Bayesian linear regression framework to detect spatial health disparities. This enables us to exploit Bayesian conjugate posterior distributions in a more accessible manner and to accelerate computation significantly relative to existing Bayesian non-parametric approaches. Simulation experiments conducted over a county map of the entire United States demonstrate the effectiveness of our method. We then apply it to a data set from the Institute for Health Metrics and Evaluation (IHME) of age-standardized US county-level estimates of mortality rates from tracheal, bronchus, and lung cancer.