Rule-based models, such as decision trees, appeal to practitioners because they are interpretable. However, the learning algorithms that produce such models are often vulnerable to spurious associations and are therefore not guaranteed to extract causally relevant insights. In this work, we build on ideas from the invariant causal prediction literature to propose Invariant Causal Set Covering Machines, an extension of the classical Set Covering Machine algorithm for conjunctions/disjunctions of binary-valued rules that provably avoids spurious associations. We demonstrate both theoretically and empirically that our method can identify the causal parents of a variable of interest in polynomial time.