Recent years have seen growing interest in capturing and maintaining causal relationships in neural network (NN) models. In this work, we study causal approaches for estimating and maintaining input-output attributions in NN models. Existing efforts in this direction assume that the input variables are independent (by virtue of the NN architecture) and hence study only direct causal effects. Viewing an NN as a structural causal model (SCM), we instead go beyond direct effects: we introduce edges among input features and provide a simple yet effective methodology to capture and maintain both direct and indirect causal effects while training an NN model. We also propose effective approximation strategies for quantifying causal attributions in high-dimensional data. A wide range of experiments on synthetic and real-world datasets shows that the proposed ante-hoc method learns causal attributions, for both direct and indirect causal effects, that are close to the ground-truth effects.
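To make the distinction between direct and indirect causal effects concrete, the following is a minimal sketch on a toy linear SCM with edges x1 -> x2 -> y and x1 -> y. It is not the method proposed in this work; the coefficients `A`, `B`, `C` and all function names are hypothetical and chosen only for illustration. The point is that ignoring the edge between the two inputs recovers only the direct part of the effect of x1 on y.

```python
# Toy linear SCM (hypothetical coefficients, not taken from the paper):
#   x1 -> x2 -> y   and   x1 -> y
A = 2.0  # strength of the edge x1 -> x2
B = 3.0  # strength of the direct edge x1 -> y
C = 0.5  # strength of the edge x2 -> y


def f_x2(x1):
    """Structural equation for x2 given its parent x1 (noise omitted)."""
    return A * x1


def f_y(x1, x2):
    """Structural equation for y given its parents x1 and x2 (noise omitted)."""
    return B * x1 + C * x2


def total_effect(x1_new, x1_base):
    """Effect of do(x1) on y when x2 is allowed to respond to x1."""
    return f_y(x1_new, f_x2(x1_new)) - f_y(x1_base, f_x2(x1_base))


def direct_effect(x1_new, x1_base):
    """Effect of do(x1) on y with x2 held at its baseline value."""
    x2_fixed = f_x2(x1_base)
    return f_y(x1_new, x2_fixed) - f_y(x1_base, x2_fixed)


def indirect_effect(x1_new, x1_base):
    """Effect on y transmitted only through x2; x1 stays at baseline in f_y."""
    return f_y(x1_base, f_x2(x1_new)) - f_y(x1_base, f_x2(x1_base))


total = total_effect(1.0, 0.0)
direct = direct_effect(1.0, 0.0)
indirect = indirect_effect(1.0, 0.0)
print(total, direct, indirect)
```

In this linear toy case the total effect decomposes exactly into the direct and indirect parts; with nonlinear structural equations such as those realized by an NN, this additive decomposition does not hold in general.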