Causal effects are typically defined in terms of the marginal distributions of potential outcomes. However, many real-world applications require causal estimands that involve the joint distribution of potential outcomes, which enables more nuanced treatment evaluation and selection. In this article, we propose a novel framework for identifying and estimating the joint distribution of potential outcomes using multiple experimental datasets. We introduce the assumption that the state transition probabilities of potential outcomes are transportable across datasets, and we establish identification of the joint distribution under this assumption together with a standard full-column-rank condition. The key identification assumptions are testable in an overidentified setting and are analogous to those used with instrumental variables, with the dataset indicator serving as the "instrument". Moreover, we propose an easy-to-use least-squares-based estimator for the joint distribution of potential outcomes in each dataset and prove its consistency and asymptotic normality. We further extend the framework to identify and estimate principal causal effects. We demonstrate the framework empirically through extensive simulations and by applying it to evaluate a surrogate endpoint in a real-world application.
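To make the identification idea concrete, the following is a minimal sketch of one way a least-squares approach of this kind could work for a discrete outcome. It is an illustration under stated assumptions, not the estimator defined in the article. It assumes an outcome with m levels, K experimental datasets, and a transition matrix T with entries P(Y(1)=j | Y(0)=i) shared by all datasets, so that the marginal distributions satisfy p1 = p0 T in every dataset. The function names `estimate_transition` and `joint_distribution` and all numbers are hypothetical.

```python
import numpy as np

def estimate_transition(p0, p1):
    """Least-squares solve of p0 @ T = p1 for the shared transition matrix T.

    p0, p1: (K, m) arrays; row k holds the marginal distribution of Y(0)
    (control arm) and Y(1) (treated arm) in dataset k.
    Assumption of this sketch: p0 has full column rank m.
    """
    if np.linalg.matrix_rank(p0) < p0.shape[1]:
        raise ValueError("p0 must have full column rank")
    T, *_ = np.linalg.lstsq(p0, p1, rcond=None)
    return T

def joint_distribution(p0_k, T):
    """Joint P_k(Y(0)=i, Y(1)=j) = P_k(Y(0)=i) * T[i, j] for one dataset."""
    return p0_k[:, None] * T

# Hypothetical population quantities: binary outcome (m = 2), K = 3 datasets.
T_true = np.array([[0.8, 0.2],
                   [0.3, 0.7]])
p0 = np.array([[0.5, 0.5],
               [0.2, 0.8],
               [0.7, 0.3]])
p1 = p0 @ T_true  # marginals of Y(1) implied by the shared transitions

T_hat = estimate_transition(p0, p1)
joint = joint_distribution(p0[0], T_hat)  # joint distribution in dataset 0
```

With K greater than m, as here, the system is overidentified, which is the setting in which the abstract says the assumptions become testable. In finite samples p0 and p1 would be replaced by empirical frequencies from the randomized arms, and an unconstrained least-squares solution is not guaranteed to lie in [0, 1], so a practical implementation would likely need constraints or projection.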