Selection bias is ubiquitous in real-world data and can lead to misleading results if not dealt with properly. We introduce a conditioning operation on Structural Causal Models (SCMs) to model latent selection from a causal perspective. We show that the conditioning operation transforms an SCM with an explicit latent selection mechanism into an SCM without such a selection mechanism, which partially encodes the causal semantics of the selected subpopulation according to the original SCM. Furthermore, we show that this conditioning operation preserves the simplicity, acyclicity, and linearity of SCMs, and commutes with marginalization. Thanks to these properties, the conditioning operation, combined with marginalization and intervention, offers a valuable tool for conducting causal reasoning tasks within causal models where latent details have been abstracted away. We demonstrate by example how classical results of causal inference can be generalized to include selection bias and how the conditioning operation helps with modeling real-world problems.
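To make the phenomenon concrete, the following is a minimal simulation sketch of selection bias in a linear SCM. It is not the conditioning operation introduced here; it only illustrates why an explicit selection mechanism matters. The variable names (`x`, `y`, `s`), the noise scales, and the selection rule `s > threshold` are all assumptions chosen for illustration: two independent causes `x` and `y` feed a common effect `s`, and only units with large `s` are retained.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, threshold=1.0, seed=0):
    """Sample a toy linear SCM and compare the full and selected populations.

    Hypothetical structural equations (illustrative only):
        x := e_x,  y := e_y,  s := x + y + e_s
    with independent Gaussian noise terms. Selection keeps units with s > threshold.
    """
    rng = random.Random(seed)
    population = []
    for _ in range(n):
        x = rng.gauss(0, 1)              # exogenous cause 1
        y = rng.gauss(0, 1)              # exogenous cause 2, independent of x
        s = x + y + rng.gauss(0, 0.5)    # common effect used for selection
        population.append((x, y, s))

    # Latent selection: only units with s above the threshold are observed.
    selected = [(x, y) for x, y, s in population if s > threshold]

    full_corr = pearson([p[0] for p in population], [p[1] for p in population])
    sel_corr = pearson([p[0] for p in selected], [p[1] for p in selected])
    return full_corr, sel_corr, len(selected)


full_corr, sel_corr, n_selected = simulate()
print(f"correlation(x, y), full population:     {full_corr:+.3f}")
print(f"correlation(x, y), selected population: {sel_corr:+.3f} (n = {n_selected})")
```

In the full population `x` and `y` are close to uncorrelated, whereas in the selected subpopulation they are clearly negatively correlated even though neither causes the other. An analysis that ignores the selection step could misread this dependence as a causal or confounded relationship.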