Evaluating the performance of causal discovery algorithms that aim to find causal relationships between time-dependent processes remains challenging. In this paper, we show that certain characteristics of datasets, such as varsortability (Reisach et al. 2021) and $R^2$-sortability (Reisach et al. 2023), also occur in datasets of autocorrelated stationary time series. To show this, we adapt varsortability and $R^2$-sortability to time series data and illustrate the effect empirically on four types of data: simulated data based on SVAR models and Erd\H{o}s-R\'enyi graphs, the data used in the 2019 causality-for-climate challenge (Runge et al. 2019), real-world river stream datasets, and real-world data generated by the Causal Chamber (Gamella et al. 2024). We also investigate the extent to which the performance of score-based causal discovery methods coincides with high sortability. Arguably, our most surprising finding is that the investigated real-world datasets exhibit high varsortability and low $R^2$-sortability, indicating that scales may carry a significant amount of causal information.
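To make the notion of sortability concrete, the following is a minimal Python sketch of how varsortability could be computed for a time series. It follows the path-based definition of Reisach et al. (2021): the fraction of directed paths along which the score (here, the marginal variance) increases from cause to effect, with ties counted as one half. The adaptation to time series shown here is an assumption made for illustration and is not necessarily the one used in the paper: lagged coefficient matrices are collapsed into a summary graph with self-loops removed. The helper names `summary_graph`, `sortability`, and `simulate_var1`, the convention that `B[i, j]` is the effect of variable `i` on variable `j`, and the example coefficients are all hypothetical.

```python
import numpy as np


def summary_graph(coeffs):
    """Collapse lagged coefficients of shape (n_lags, d, d) into a d x d graph.

    Assumed convention: coeffs[l, i, j] != 0 means X_i at lag l affects X_j.
    Self-loops (autocorrelation) are dropped.
    """
    A = (np.asarray(coeffs) != 0).any(axis=0)
    np.fill_diagonal(A, False)
    return A


def sortability(adjacency, scores, tol=1e-9):
    """Fraction of directed paths i -> ... -> j with scores[i] < scores[j].

    Pairs are counted once per path length; ties within tol count as 1/2.
    """
    E = (np.asarray(adjacency) != 0).astype(int)
    scores = np.asarray(scores, dtype=float)
    d = E.shape[0]
    diff = scores[None, :] - scores[:, None]  # diff[i, j] = scores[j] - scores[i]
    off_diag = ~np.eye(d, dtype=bool)  # ignore i -> i pairs arising from cycles
    n_paths, n_sorted = 0.0, 0.0
    Ek = E.copy()
    for _ in range(d - 1):
        reach = (Ek > 0) & off_diag  # pairs joined by a path of this length
        n_paths += reach.sum()
        n_sorted += (reach & (diff > tol)).sum()
        n_sorted += 0.5 * (reach & (np.abs(diff) <= tol)).sum()
        Ek = (Ek @ E > 0).astype(int)
    return n_sorted / n_paths if n_paths > 0 else float("nan")


def simulate_var1(B, n_steps, rng, burn_in=200):
    """Simulate X_t = X_{t-1} B + noise with standard normal noise."""
    d = B.shape[0]
    X = np.zeros((n_steps + burn_in, d))
    for t in range(1, n_steps + burn_in):
        X[t] = X[t - 1] @ B + rng.standard_normal(d)
    return X[burn_in:]


# Hypothetical stable VAR(1): a chain X0 -> X1 -> X2 with autocorrelation 0.5.
B = np.array([[0.5, 0.8, 0.0],
              [0.0, 0.5, 0.8],
              [0.0, 0.0, 0.5]])
X = simulate_var1(B, n_steps=2000, rng=np.random.default_rng(0))
v = sortability(summary_graph(B[None]), np.var(X, axis=0))
print(f"varsortability of the simulated series: {v:.2f}")
```

Passing a different per-variable score in place of the marginal variances yields the corresponding sortability under the same path-counting scheme. A value near 1 means that sorting variables by the score largely recovers the causal order, which is why high varsortability suggests that data scales carry causal information.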