In recent years, environmental epidemiology has placed increasing emphasis on understanding the health effects of mixtures of metals, chemicals, and pollutants. Bayesian Kernel Machine Regression (BKMR) is a statistical method that has gained significant traction in environmental mixture studies because it can account for complex non-linear relationships between the exposures and a health outcome and can identify interaction effects among the exposures. However, BKMR makes the crucial assumption that the error terms have constant variance, and this assumption is rarely checked in practice. In this paper, we create a diagnostic function for checking the constant variance assumption and develop Heteroscedastic BKMR (HBKMR) for environmental mixture analyses in which the assumption is not met. By specifying a Bayesian hierarchical model for the error variance parameters, HBKMR produces updated estimates of the environmental mixture's health effects and their corresponding 95% credible intervals. We apply HBKMR in the two real-world case studies that motivated this work: (1) examining the effects of prenatal metal exposures on behavioral problems in toddlers living in Suriname and (2) assessing the impacts of metal exposures on simple reaction time in children living near coal-fired power plants in Kentucky. In both case studies, HBKMR substantially improves model fit compared to BKMR, with differences in some of the mixture effect estimates and typically narrower 95% credible intervals after accounting for the heteroscedasticity.
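To make the constant variance assumption concrete, the following is a minimal sketch of one generic way to check it. It is not the diagnostic function developed in this paper, whose details are not given here. As an assumption for illustration, it fits a simple straight-line model (not BKMR) to simulated data and computes a Breusch-Pagan-style statistic in Koenker's studentized form, n times the R-squared from regressing the squared residuals on the fitted values. All function names and the simulation settings are hypothetical.

```python
import random

# 5% critical value of a chi-square distribution with 1 degree of freedom.
CHI2_CRIT_1DF_05 = 3.841


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def r_squared(x, y):
    """R-squared of the least-squares line of y on x."""
    intercept, slope = ols_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def breusch_pagan_lm(fitted, residuals):
    """Studentized Breusch-Pagan statistic: n * R^2 from regressing
    squared residuals on fitted values. Large values suggest that the
    error variance changes with the fitted mean."""
    squared = [r * r for r in residuals]
    return len(squared) * r_squared(fitted, squared)


def simulate(n, hetero, seed):
    """Simulate y = 1 + 2x + error (illustrative settings only).
    If hetero is True, the error standard deviation grows with x."""
    rng = random.Random(seed)
    x = [rng.uniform(0.0, 1.0) for _ in range(n)]
    y = []
    for xi in x:
        sd = 0.2 + 2.0 * xi if hetero else 1.0
        y.append(1.0 + 2.0 * xi + rng.gauss(0.0, sd))
    return x, y


def diagnose(x, y):
    """Fit the line, then return the Breusch-Pagan-style statistic."""
    intercept, slope = ols_fit(x, y)
    fitted = [intercept + slope * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    return breusch_pagan_lm(fitted, residuals)


if __name__ == "__main__":
    for label, hetero in [("constant variance", False), ("non-constant variance", True)]:
        x, y = simulate(500, hetero, seed=1)
        stat = diagnose(x, y)
        flag = "reject" if stat > CHI2_CRIT_1DF_05 else "do not reject"
        print(f"{label}: LM = {stat:.2f} ({flag} constant variance at 5%)")
```

In a real mixture analysis, the residuals would come from the fitted kernel machine model rather than from a straight line, and a residual-versus-fitted plot would usually accompany any such statistic.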