Invariant causal prediction provides a useful framework for identifying causal predictors of a response using heterogeneous data from multiple environments. One valuable property of the original invariant causal prediction method is that it guarantees no false causal discoveries with high probability. Such a guarantee, however, can be overly conservative in some applications, resulting in few or no causal discoveries. This raises a natural question: can invariant causal prediction be equipped with less conservative error guarantees and thereby extract more causal information from the data? In this paper, we address this question by focusing on two widely used and more liberal guarantees: false discovery rate control and simultaneous true discovery bounds. A key step in our approach is to reformulate invariant causal prediction as a multiple testing problem. We then adopt the e-Closure principle to obtain (simultaneous) false discovery rate control, together with new p-to-e calibrators tailored to this setting. We also derive simultaneous true discovery bounds via closed testing, which provide additional causal information without requiring extra assumptions and retain all discoveries from the original invariant causal prediction method. Through simulations and a real data application on educational attainment of teenagers in the United States, we show that these more liberal error control guarantees can improve the practical usefulness of invariant causal prediction.
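To make concrete why the original guarantee can yield few or no discoveries, the following is a minimal sketch of the intersection step commonly associated with the original invariant causal prediction method: each subset of predictors gets a p-value for the hypothesis that it is invariant across environments, and the reported causal predictors are those contained in every subset that is not rejected. The function name `icp_intersection`, the variable names, and all p-values below are hypothetical and chosen purely for illustration; the sketch does not implement the e-Closure or closed testing procedures developed in the paper.

```python
def icp_intersection(pvalues, alpha=0.05):
    """Intersect all predictor subsets whose invariance hypothesis is not rejected.

    pvalues maps a tuple of predictor names (one candidate subset) to a
    p-value for the hypothesis that this subset is invariant across
    environments. The p-values are assumed to be given; computing them
    is outside the scope of this sketch.
    """
    accepted = [set(subset) for subset, p in pvalues.items() if p > alpha]
    if not accepted:
        # Every subset is rejected: report no causal predictors.
        return set()
    return set.intersection(*accepted)


# Hypothetical p-values: every accepted subset contains X1.
clear_case = {
    (): 0.001,
    ("X1",): 0.40,
    ("X2",): 0.002,
    ("X1", "X2"): 0.30,
}

# Hypothetical p-values: {X1} and {X2} are both accepted, so the
# intersection is empty even though the empty set itself is rejected.
ambiguous_case = {
    (): 0.001,
    ("X1",): 0.20,
    ("X2",): 0.15,
    ("X1", "X2"): 0.30,
}

print(icp_intersection(clear_case))
print(icp_intersection(ambiguous_case))
```

In the second case the data cannot tell X1 and X2 apart, so the intersection reports nothing. This is the kind of outcome that motivates the more liberal guarantees considered in the paper.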