Causal inference in a sub-population involves identifying the causal effect of an intervention on a specific subgroup, which is distinguished from the whole population by systematic biases in the sampling process. Ignoring the subtleties introduced by sub-populations can either lead to erroneous inference or limit the applicability of existing methods. We introduce and advocate for a causal inference problem in sub-populations (henceforth called s-ID), in which we have access only to observational data from the targeted sub-population, as opposed to the entire population. Existing inference problems in sub-populations assume that the given data distributions originate from the entire population and therefore cannot tackle the s-ID problem. To address this gap, we provide necessary and sufficient conditions that must hold in the causal graph for a causal effect in a sub-population to be identifiable from the observational distribution of that sub-population. Given these conditions, we present a sound and complete algorithm for the s-ID problem.
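As a rough formalization of the problem statement above, under notation that is assumed here for illustration and is not taken from the text: let V denote the observed variables, X the intervened variables, Y the outcome variables, and S a binary indicator in which S = 1 marks membership in the target sub-population. The s-ID question can then be read as asking whether the interventional quantity on the left is uniquely determined by the sub-population's observational distribution on the right:

```latex
% Assumed notation (not from the source): V = observed variables,
% X = intervened variables, Y = outcomes, S = 1 marks the target
% sub-population, \mathcal{G} = the causal graph.
P\bigl(y \mid \mathrm{do}(x),\, S = 1\bigr) \;=\; f\bigl(P(V \mid S = 1)\bigr)
\quad \text{for some functional } f \text{ determined by } \mathcal{G}.
```

Under this reading, the effect is identifiable when such an f exists, and the settings that assume population-level data would instead take a distribution such as P(V) as input.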