Discovering causal relations from observational data is possible only under additional assumptions, for example that the functional relations are nonlinear with additive noise (an additive noise model, ANM). Even under such strong assumptions, causal discovery involves an expensive search over the space of directed acyclic graphs (DAGs). \emph{Topological ordering} approaches shrink the optimisation space of causal discovery by searching over permutations of the variables rather than over graphs. For ANMs, the \emph{Hessian} of the data log-likelihood can be used to find leaf nodes in the causal graph, which in turn gives a topological ordering. However, existing methods for computing the Hessian do not scale as the number of variables and the number of samples increase. Inspired by recent advances in diffusion probabilistic models (DPMs), we therefore propose \emph{DiffAN}\footnote{Implementation is available at \url{https://github.com/vios-s/DiffAN} .}, a topological ordering algorithm that uses DPMs to learn a Hessian function. We introduce theory for updating the learned Hessian without re-training the neural network, and we show that computing with a subset of samples gives an accurate approximation of the ordering, which allows scaling to datasets with more samples and variables. We show empirically that our method scales well to datasets with up to $500$ nodes and up to $10^5$ samples, while performing on par with state-of-the-art causal discovery methods on small datasets.
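As a rough illustration of how the Hessian of the log-likelihood can single out a leaf node, the sketch below uses a two-variable toy ANM, $x_1 \sim \mathcal{N}(0,1)$ and $x_2 = \sin(x_1) + \mathcal{N}(0,1)$, whose log-density is known in closed form. It assumes the criterion that, under Gaussian additive noise, the diagonal Hessian entry of a leaf is constant across samples, so the variable with the smallest variance is taken as the leaf. This is not the DiffAN implementation: the Hessian here comes from finite differences on an analytic density rather than from a trained DPM, and the names `log_density`, `hessian_diagonal` and `find_leaf` are hypothetical.

```python
import math
import random
import statistics


def log_density(x1, x2):
    """Unnormalised log p(x1, x2) for x1 ~ N(0,1), x2 = sin(x1) + N(0,1)."""
    return -0.5 * x1 ** 2 - 0.5 * (x2 - math.sin(x1)) ** 2


def hessian_diagonal(f, point, eps=1e-3):
    """Central finite-difference estimate of d^2 f / dx_i^2 at `point`."""
    diag = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        diag.append((f(*up) - 2.0 * f(*point) + f(*down)) / eps ** 2)
    return diag


def find_leaf(samples):
    """Return the index whose Hessian diagonal varies least, plus all variances."""
    diags = [hessian_diagonal(log_density, s) for s in samples]
    # zip(*diags) groups the values per variable across all samples.
    variances = [statistics.pvariance(col) for col in zip(*diags)]
    return variances.index(min(variances)), variances


# Draw samples from the toy model x1 -> x2 (assumed ground truth: x2 is the leaf).
rng = random.Random(0)
samples = []
for _ in range(2000):
    x1 = rng.gauss(0.0, 1.0)
    x2 = math.sin(x1) + rng.gauss(0.0, 1.0)
    samples.append((x1, x2))

leaf, variances = find_leaf(samples)
print("estimated leaf index:", leaf)
print("variance of Hessian diagonal per variable:", variances)
```

In this toy model the second derivative with respect to $x_2$ is the constant $-1$, while the one with respect to $x_1$ depends on both variables, so the variance separates the leaf from its parent. Repeating the step on the remaining variables would build an ordering, but that requires the marginal density after each removal, which is where a learned Hessian comes in.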