Many methods exist for causal discovery from time series data, including those based on deep learning techniques. However, these methods typically do not adopt one of the most prevalent paradigms in deep learning: end-to-end learning. To address this gap, we explore what we call Causal Pretraining, a methodology that aims to learn a direct mapping from multivariate time series to the underlying causal graphs in a supervised manner. Our empirical findings suggest that supervised causal discovery is possible, provided that the training and test time series samples share most of their dynamics. More importantly, we found evidence that the performance of Causal Pretraining can increase with data and model size, even if the additional data do not share the same dynamics. Further, we provide examples where causal discovery for real-world data with causally pretrained neural networks is possible within limits. We argue that this hints at the possibility of a foundation model for causal discovery.
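To make the supervised setup concrete, the following is a minimal sketch of how one labeled training pair (a multivariate time series and its lagged causal graph) might be generated. It assumes a synthetic, stable linear VAR process with Gaussian noise; the function name `sample_causal_pair`, its parameters, and the choice of generator are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def sample_causal_pair(n_vars=3, max_lag=2, length=200, edge_prob=0.3, seed=0):
    """Draw one (time series, lagged causal graph) pair.

    Hypothetical generator: a linear VAR process with Gaussian noise.
    """
    rng = np.random.default_rng(seed)

    # graph[i, j, k] == 1 means variable j at lag k + 1 causes variable i.
    graph = (rng.random((n_vars, n_vars, max_lag)) < edge_prob).astype(np.int8)

    # Random coefficients on the sampled edges only.
    weights = rng.uniform(-0.5, 0.5, size=graph.shape) * graph

    # Rescale rows so their absolute weights sum to at most 0.9,
    # a sufficient condition for the process to stay bounded.
    row_sum = np.abs(weights).sum(axis=(1, 2), keepdims=True)
    weights = weights * np.minimum(1.0, 0.9 / np.maximum(row_sum, 1e-12))

    # Simulate with max_lag warm-up steps that are dropped afterwards.
    series = np.zeros((length + max_lag, n_vars))
    series[:max_lag] = rng.normal(size=(max_lag, n_vars))
    for t in range(max_lag, length + max_lag):
        for k in range(max_lag):
            series[t] += weights[:, :, k] @ series[t - k - 1]
        series[t] += rng.normal(scale=0.1, size=n_vars)

    return series[max_lag:], graph


# One training example: input of shape (length, n_vars), binary label graph.
x, y = sample_causal_pair()
```

Under these assumptions, a network would take `x` as input and be trained to predict the binary tensor `y`, for example with a per-edge binary cross-entropy loss, so that at test time the causal graph is read off in a single forward pass.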