Probabilities of causation play a fundamental role in decision making in law, health care and public policy. Nevertheless, their point identification is challenging and requires strong assumptions such as monotonicity. In the absence of such assumptions, existing work requires multiple observations of datasets that contain the same treatment and outcome variables in order to establish bounds on these probabilities. However, in many clinical trials and public policy evaluations, there exist independent datasets that each examine the effect of a different treatment on the same outcome variable. Here, we outline how to significantly tighten existing bounds on the probabilities of causation by imposing counterfactual consistency between structural causal models (SCMs) constructed from such independent datasets (the 'causal marginal problem'). Next, we describe a new information-theoretic approach to the falsification of counterfactual probabilities, using conditional mutual information to quantify counterfactual influence. This approach generalises to arbitrary discrete variables and any number of treatments, and renders the causal marginal problem more interpretable. Since the question of what counts as 'tight enough' is left to the user, we provide an additional method of inference for when the bounds are unsatisfactory: a maximum-entropy-based method that defines a metric on the space of plausible SCMs and proposes the entropy-maximising SCM for inferring counterfactuals in the absence of more information.
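For readers unfamiliar with the kind of bounds the abstract refers to, the following is a minimal sketch of the classical Tian-Pearl bounds on the probability of necessity and sufficiency (PNS) for a binary treatment and a binary outcome, computed from experimental data alone. It is not the method proposed here, which tightens such bounds using additional independent datasets. The function name `pns_bounds`, its parameter names, and the numbers in the usage lines are illustrative assumptions and do not come from the paper.

```python
def pns_bounds(p_y_do_x: float, p_y_do_not_x: float) -> tuple[float, float]:
    """Classical Tian-Pearl bounds on PNS from experimental data only.

    p_y_do_x     -- P(Y = 1 | do(X = 1)), outcome rate under treatment
    p_y_do_not_x -- P(Y = 1 | do(X = 0)), outcome rate under control

    Both names are hypothetical and chosen only for this illustration.
    """
    for p in (p_y_do_x, p_y_do_not_x):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probabilities must lie in [0, 1], got {p}")

    # Lower bound: the average causal effect, clipped at zero.
    lower = max(0.0, p_y_do_x - p_y_do_not_x)
    # Upper bound: min of P(y | do(x)) and P(not y | do(not x)).
    upper = min(p_y_do_x, 1.0 - p_y_do_not_x)
    return lower, upper


# Made-up interventional probabilities, for illustration only.
lower, upper = pns_bounds(0.7, 0.4)
print(f"PNS lies in [{lower:.2f}, {upper:.2f}]")
```

Under monotonicity, PNS is point identified and equals the difference `p_y_do_x - p_y_do_not_x`, which coincides with the lower bound whenever that difference is non-negative. Without monotonicity, only the interval is available, which is why tighter bounds are of interest.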