The widely used 'Counterfactual' definition of Causal Effects was derived for unbiasedness and accuracy, not generalizability. We propose a Combinatorial definition for the External Validity (EV) of intervention effects. We first define the concept of an effect observation 'background'. We then formulate conditions for effect generalization based on each effect's set of (observed and unobserved) backgrounds. This reveals two limits for effect generalization: (1) when effects are observed under all their enumerable backgrounds, or (2) when backgrounds have become sufficiently randomized. We use the resulting combinatorial framework to re-examine several issues in the original counterfactual formulation: out-of-sample validity, concurrent estimation of multiple effects, bias-variance tradeoffs, statistical power, and connections to current predictive and explanation techniques. Methodologically, the definitions also allow us to replace the parametric estimation problems that followed from the counterfactual definition with combinatorial enumeration and randomization problems in non-experimental samples. We use this non-parametric framework to demonstrate tradeoffs among External Validity, Unconfoundedness, and Precision in the performance of popular supervised, explanation, and causal-effect estimators. We also illustrate how the approach allows supervised and explanation methods to be used in non-i.i.d. samples. The COVID-19 pandemic highlighted the need for learning solutions that provide predictions in severely incomplete samples. We demonstrate applications to this pressing problem.