In this paper, we develop a generic methodology for encoding the hierarchical causal structure among observed variables into a neural network in order to improve its predictive performance. The proposed methodology, called the causality-informed neural network (CINN), uses three coherent steps to systematically map structural causal knowledge into the layer-to-layer design of a neural network while strictly preserving the orientation of every causal relationship. In the first step, CINN discovers causal relationships from observational data via directed acyclic graph (DAG) learning, where causal discovery is recast as a continuous optimization problem to avoid a combinatorial search over graph structures. In the second step, the discovered hierarchical causal structure among observed variables is systematically encoded into the neural network through a dedicated architecture and a customized loss function. By categorizing variables in the causal DAG as root, intermediate, and leaf nodes, the hierarchical causal DAG is translated into CINN with a one-to-one correspondence between nodes in the causal DAG and units in CINN, while the relative order among these nodes is maintained. Regarding the loss function, both intermediate and leaf nodes in the DAG are treated as target outputs during CINN training so as to drive co-learning of the causal relationships among different types of nodes. As multiple loss components emerge in CINN, we use the projection of conflicting gradients to mitigate gradient interference among the multiple learning tasks. Computational experiments across a broad spectrum of UCI data sets demonstrate substantial advantages of CINN in predictive performance over other state-of-the-art methods. In addition, an ablation study underscores the value of integrating structural and quantitative causal knowledge in incrementally enhancing the neural network's predictive performance.
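To make the node categorization concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual graph-theoretic reading of the three roles: a root has no parents, a leaf has parents but no children, and an intermediate node has both. The function name `categorize_nodes`, the variable names, and the toy edge list are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def categorize_nodes(nodes, edges):
    """Label each DAG node as 'root', 'intermediate', or 'leaf'.

    Assumed definitions (not taken from the paper's text):
      root         -> no incoming edge
      intermediate -> at least one incoming and one outgoing edge
      leaf         -> at least one incoming edge, no outgoing edge
    An isolated node (no edges at all) is labeled 'root' here.
    """
    has_parent = defaultdict(bool)
    has_child = defaultdict(bool)
    for cause, effect in edges:
        has_child[cause] = True
        has_parent[effect] = True

    roles = {}
    for n in nodes:
        if not has_parent[n]:
            roles[n] = "root"
        elif has_child[n]:
            roles[n] = "intermediate"
        else:
            roles[n] = "leaf"
    return roles


# Hypothetical toy DAG: age -> bmi -> bp, plus a direct edge age -> bp.
nodes = ["age", "bmi", "bp"]
edges = [("age", "bmi"), ("bmi", "bp"), ("age", "bp")]
roles = categorize_nodes(nodes, edges)

# Per the abstract, intermediate and leaf nodes both serve as training targets,
# so each non-root node would contribute its own loss component.
targets = [n for n in nodes if roles[n] != "root"]
```

In this toy graph, `age` would act as a network input, while `bmi` and `bp` would each be a target output with its own loss term.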
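The projection of conflicting gradients mentioned in the abstract can be illustrated with a small sketch. It assumes the standard gradient-surgery rule: when the gradients of two tasks have a negative inner product, one is projected onto the normal plane of the other before the task gradients are summed. The helper names `dot` and `pcgrad`, the plain-list gradient representation, and the example vectors are hypothetical, and the exact variant used in CINN may differ.

```python
import random


def dot(u, v):
    """Inner product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def pcgrad(task_grads, rng=None):
    """Combine per-task gradients after removing pairwise conflicts.

    For each task gradient g_i, visit the other tasks in random order;
    whenever g_i . g_j < 0, replace g_i by
        g_i - (g_i . g_j / ||g_j||^2) * g_j,
    where g_j is the ORIGINAL gradient of task j. The projected
    gradients are then summed into one update direction.
    """
    rng = rng or random.Random(0)
    projected = []
    for i, g in enumerate(task_grads):
        g_i = list(g)  # work on a copy; originals stay untouched
        others = [j for j in range(len(task_grads)) if j != i]
        rng.shuffle(others)
        for j in others:
            g_j = task_grads[j]
            inner = dot(g_i, g_j)
            if inner < 0:  # conflict: drop the component of g_i along g_j
                scale = inner / dot(g_j, g_j)
                g_i = [a - scale * b for a, b in zip(g_i, g_j)]
        projected.append(g_i)
    # Sum the projected gradients coordinate-wise.
    return [sum(col) for col in zip(*projected)]


# Two hypothetical loss components whose gradients conflict (inner product -1).
g_intermediate = [1.0, 0.0]
g_leaf = [-1.0, 1.0]
combined = pcgrad([g_intermediate, g_leaf])
```

With these two vectors, the first gradient is projected to `[0.5, 0.5]` and the second to `[0.0, 1.0]`, so the combined direction no longer opposes either original task gradient.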