This study quantifies the association between air pollution and mortality in Ontario, Canada. Exposure-response relationships in air pollution epidemiology are complex because of three features: time-lagged associations, non-linear associations, and multiple pollutants. To address the first two features, two distinct classes of distributed lag non-linear models (DLNMs) have been proposed, but extending them to multiple exposures and selecting an appropriate model remain challenging. We propose a unified framework for multiple-exposure DLNMs that integrates model specification, estimation, selection and stacking. The framework applies to four model structures: two additive DLNMs and two proposed single-index DLNMs, all applicable to general outcome types, including the mortality counts in the motivating application. We develop an estimation approach that applies to all four models. Because it is difficult to choose among the candidate DLNMs a priori, we derive an AIC to select among them. As an alternative to selecting a single model, we also extend a model stacking approach to combine inferences across the four DLNMs and propose an implementation that scales to our dataset of 106,346 observations. In the motivating analysis, the four DLNMs yield different estimates, and the proposed stacking approach identifies significant associations between respiratory mortality and a mixture of PM2.5, O3 and NO2.