Methods for causal inference are well developed for binary and continuous exposures, but in many settings the exposure has substantial mass at zero; such exposures are called semi-continuous. We propose a general causal framework for semi-continuous exposures, together with a novel two-stage estimation strategy. We introduce a two-part propensity structure for the semi-continuous exposure, with one component for exposure status (exposed vs. unexposed) and another for the exposure level among those exposed, and incorporate both into a marginal structural model that disentangles the effects of exposure status and dose. The two-stage procedure sequentially targets the causal dose-response relationship among exposed individuals and the causal effect of exposure status at a reference dose, allowing flexibility in the choice of propensity score methods in the second stage. We establish consistency and asymptotic normality for the resulting estimators, and characterise their limiting values under misspecification of the propensity score models. Simulation studies evaluate finite-sample performance and robustness, and an application to a study of prenatal alcohol exposure and child cognition demonstrates how the proposed methods can be used to address a range of scientific questions about both exposure status and exposure intensity.
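To make the two-part propensity structure concrete, the following is a minimal sketch under assumptions that are ours and not stated in the abstract: exposure status is modelled by logistic regression, and the dose among the exposed is modelled as log-normal given covariates. The function names (`simulate`, `fit_logistic`, `fit_two_part`, `two_part_propensity`) and the simulated data are hypothetical, and the sketch covers only the propensity components, not the marginal structural model or the two-stage estimator.

```python
import numpy as np


def simulate(n, seed):
    """Simulate one covariate and a semi-continuous exposure (illustrative only)."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    X = np.column_stack([np.ones(n), x])           # design matrix with intercept
    p_exposed = 1.0 / (1.0 + np.exp(-(-0.5 + 1.0 * x)))
    exposed = rng.random(n) < p_exposed            # part 1: exposure status
    log_dose = 0.2 + 0.5 * x + 0.7 * rng.normal(size=n)
    a = np.where(exposed, np.exp(log_dose), 0.0)   # part 2: dose if exposed, else 0
    return X, a


def fit_logistic(X, y, n_iter=25):
    """Logistic regression fitted by Newton-Raphson; X must contain an intercept."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def fit_two_part(X, a):
    """Fit the status model on everyone and the dose model on the exposed only."""
    exposed = a > 0
    beta = fit_logistic(X, exposed.astype(float))
    Xe, log_a = X[exposed], np.log(a[exposed])
    gamma, *_ = np.linalg.lstsq(Xe, log_a, rcond=None)
    resid = log_a - Xe @ gamma
    sigma = np.sqrt(resid @ resid / (len(log_a) - Xe.shape[1]))
    return beta, gamma, sigma


def two_part_propensity(X, a, beta, gamma, sigma):
    """Return P(exposed | X) and the two-part propensity evaluated at the observed a."""
    pi = 1.0 / (1.0 + np.exp(-X @ beta))
    exposed = a > 0
    gps = 1.0 - pi                                  # point mass at zero for the unexposed
    z = (np.log(a[exposed]) - X[exposed] @ gamma) / sigma
    lognormal_density = np.exp(-0.5 * z**2) / (a[exposed] * sigma * np.sqrt(2 * np.pi))
    gps[exposed] = pi[exposed] * lognormal_density  # status probability times dose density
    return pi, gps
```

Calling `fit_two_part(*simulate(n=5000, seed=0))` returns the status coefficients, the dose-model coefficients and the residual scale. In a weighting analysis, quantities such as `pi` and `gps` would typically enter the denominators of the weights, though how the authors combine them is not specified here.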