The widely used 'Counterfactual' definition of Causal Effects was derived for unbiasedness and accuracy, not for generalizability. We propose a Combinatorial definition for the External Validity (EV) of intervention effects. We first define the concept of an effect observation 'background'. We then formulate conditions for effect generalization based on their sets of (observable and unobservable) backgrounds. This reveals two limits for effect generalization: (1) when effects are observed under all their enumerable backgrounds, or (2) when backgrounds have become sufficiently randomized. We use the resulting combinatorial framework to re-examine several issues in the original counterfactual formulation: out-of-sample validity, concurrent estimation of multiple effects, bias-variance tradeoffs, statistical power, and connections to current predictive and explanation techniques. Methodologically, the definitions also allow us to replace the parametric estimation problems that followed the counterfactual definition with combinatorial enumeration and randomization problems in non-experimental samples. We use this non-parametric framework to demonstrate tradeoffs among External Validity, Unconfoundedness, and Precision in the performance of popular supervised, explanation, and causal-effect estimators. We demonstrate that the approach also allows these methods to be used in non-i.i.d. samples. The COVID-19 pandemic highlighted the need for learning solutions that provide predictions in severely incomplete samples. We demonstrate applications to this pressing problem.