Causal discovery from time-series data is a central task in machine learning. Granger causality inference has recently gained momentum because it is interpretable and integrates well with deep neural networks. However, most existing methods assume structured input data and degrade sharply on data with randomly missing entries or non-uniform sampling frequencies, which limits their use in real-world scenarios. To address this issue, we present CUTS, a neural Granger causal discovery algorithm that jointly imputes unobserved data points and builds causal graphs by plugging two mutually boosting modules into an iterative framework: (i) a latent data prediction stage, which uses a Delayed Supervision Graph Neural Network (DSGNN) to hallucinate and register unstructured data that may be high-dimensional and follow a complex distribution; and (ii) a causal graph fitting stage, which builds a causal adjacency matrix from the imputed data under a sparsity penalty. Experiments show that CUTS effectively infers causal graphs from unstructured time-series data and significantly outperforms existing methods. Our approach constitutes a promising step towards applying causal discovery to real applications with non-ideal observations.
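The alternation between imputation and sparse graph fitting described above can be illustrated with a deliberately simplified sketch. It is not the CUTS implementation: the DSGNN is replaced by a linear lag-1 autoregressive predictor, the sparsity penalty is an L1 term minimized with proximal gradient steps (ISTA), and all names (`fit_sparse_var1`, `impute_and_discover`, `lam`, `n_rounds`) are hypothetical choices made for illustration only.

```python
import numpy as np


def soft_threshold(x, lam):
    """Proximal operator of the L1 norm (element-wise shrinkage)."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)


def fit_sparse_var1(X, lam=0.05, n_steps=500):
    """Fit X[t+1] ~= A @ X[t] with an L1 penalty on A, using ISTA.

    A[i, j] is the estimated lag-1 influence of series j on series i.
    Linear stand-in for a neural Granger model (assumption, not CUTS).
    """
    past, future = X[:-1], X[1:]
    n, d = past.shape
    # Lipschitz constant of the gradient of (1/2n) * ||past @ A.T - future||^2
    lipschitz = np.linalg.norm(past, 2) ** 2 / n
    step = 1.0 / max(lipschitz, 1e-12)
    A = np.zeros((d, d))
    for _ in range(n_steps):
        resid = past @ A.T - future      # shape (n, d)
        grad = resid.T @ past / n        # shape (d, d)
        A = soft_threshold(A - step * grad, step * lam)
    return A


def impute_and_discover(X_obs, mask, n_rounds=5, lam=0.05):
    """Alternate between graph fitting and imputing the missing entries.

    X_obs : (T, d) array; entries where mask is False are ignored (may be NaN).
    mask  : (T, d) boolean array, True where the value was observed.
    Returns the weighted adjacency estimate A and the imputed series X.
    """
    # Initialise missing entries with the per-series mean of observed values.
    observed = np.where(mask, X_obs, 0.0)
    col_mean = observed.sum(axis=0) / np.maximum(mask.sum(axis=0), 1)
    X = np.where(mask, X_obs, col_mean)

    A = np.zeros((X.shape[1], X.shape[1]))
    for _ in range(n_rounds):
        # Graph fitting stage: sparse adjacency from the current imputed data.
        A = fit_sparse_var1(X, lam=lam)
        # Prediction stage: refill only the missing entries with one-step
        # predictions; observed entries are always kept as they are.
        pred = X[:-1] @ A.T
        X_next = X.copy()
        X_next[1:] = np.where(mask[1:], X_obs[1:], pred)
        X = X_next
    return A, X
```

In this sketch a binary causal graph could be read off by thresholding the magnitudes in `A`; the threshold, like `lam`, would need tuning for any real dataset. Irregular sampling is treated here only as missing entries on a regular grid, which is a simplifying assumption.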