Rule-based models, such as decision trees, appeal to practitioners because they are interpretable. However, the learning algorithms that produce such models are often vulnerable to spurious associations, so they are not guaranteed to extract causally relevant insights. In this work, we build on ideas from the invariant causal prediction literature to propose Invariant Causal Set Covering Machines, an extension of the classical Set Covering Machine algorithm for conjunctions/disjunctions of binary-valued rules that provably avoids spurious associations. We demonstrate both theoretically and empirically that our method can identify the causal parents of a variable of interest in polynomial time.
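For readers unfamiliar with the classical Set Covering Machine that the abstract builds on, the following is a minimal sketch of its greedy procedure for the conjunction case. It is not the invariant causal variant proposed here and includes no invariance test. The function names, the penalty `p`, and the cap `max_rules` are illustrative assumptions rather than the authors' implementation.

```python
def fit_scm_conjunction(X, y, p=1.0, max_rules=3):
    """Greedy sketch of a classical Set Covering Machine (conjunction case).

    X: list of rows, each a list of 0/1 outputs of candidate binary rules.
    y: list of 0/1 labels.
    p: assumed penalty for each positive example a rule misclassifies.
    Returns the indices of the selected rules.
    """
    n_rules = len(X[0])
    neg = {i for i, label in enumerate(y) if label == 0}  # negatives left to cover
    pos = {i for i, label in enumerate(y) if label == 1}  # positives still kept
    selected = []
    while neg and len(selected) < max_rules:
        best, best_utility, best_covered = None, None, set()
        for j in range(n_rules):
            # A rule covers a negative example when it outputs 0 on it,
            # and errs on a positive example when it outputs 0 on it.
            covered = {i for i in neg if X[i][j] == 0}
            errors = {i for i in pos if X[i][j] == 0}
            utility = len(covered) - p * len(errors)
            if covered and (best is None or utility > best_utility):
                best, best_utility, best_covered = j, utility, covered
        if best is None:  # no rule covers any remaining negative
            break
        selected.append(best)
        neg -= best_covered
        pos = {i for i in pos if X[i][best] == 1}
    return selected


def predict_conjunction(x, selected):
    """Predict 1 only if every selected rule outputs 1 on example x."""
    return int(all(x[j] == 1 for j in selected))


# Toy data (made up for illustration): the label is rule 0 AND rule 1; rule 2 is noise.
X = [[1, 1, 0], [1, 1, 1], [1, 0, 1], [0, 1, 1], [0, 0, 0], [1, 0, 0]]
y = [1, 1, 0, 0, 0, 0]
rules = fit_scm_conjunction(X, y)
predictions = [predict_conjunction(x, rules) for x in X]
```

Because this greedy criterion scores rules only by how well they separate the training examples, it can select a rule that is merely correlated with the label, which is the vulnerability to spurious associations that the abstract describes.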