Simulated data are ubiquitous in causal discovery because annotated real data are scarce. Recently, Reisach et al. (2021) highlighted a pattern in simulated linear data: marginal variance increases along the causal direction. As an ablation in their experiments, Montagna et al. (2023) found that similar patterns may emerge in nonlinear models for the variance of the score vector $\nabla \log p_{\mathbf{X}}$, and introduced the ScoreSort algorithm. In this work, we formally define and characterize this score-sortability pattern of nonlinear additive noise models. We find that it defines a class of identifiable (bivariate) causal models that overlaps with nonlinear additive noise models. We theoretically demonstrate the advantages of ScoreSort in terms of statistical efficiency compared to prior state-of-the-art score matching-based methods, and we empirically show that the most common synthetic benchmarks in the literature are score-sortable. Our findings highlight (1) the lack of diversity in the data as an important limitation in the evaluation of nonlinear causal discovery approaches, (2) the importance of thoroughly testing different settings within a problem class, and (3) the importance of analyzing statistical properties in causal discovery, where research is often limited to defining identifiability conditions of the model.
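The linear-data pattern mentioned above (marginal variance increasing along the causal direction) can be illustrated with a minimal sketch. This is not the ScoreSort algorithm and not the authors' code: the chain structure, the edge weight of 1.5, the unit-variance Gaussian noise, and the helper names `simulate_linear_chain` and `sort_by_variance` are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_linear_chain(n_samples=5000, n_nodes=4, weight=1.5, seed=0):
    """Sample from a hypothetical linear chain X0 -> X1 -> ... with unit Gaussian noise."""
    rng = np.random.default_rng(seed)
    X = np.zeros((n_samples, n_nodes))
    X[:, 0] = rng.normal(size=n_samples)
    for j in range(1, n_nodes):
        # Each node is a scaled copy of its parent plus independent noise,
        # so its variance is weight**2 * Var(parent) + 1.
        X[:, j] = weight * X[:, j - 1] + rng.normal(size=n_samples)
    return X


def sort_by_variance(X):
    """Return node indices ordered by increasing empirical marginal variance."""
    return np.argsort(X.var(axis=0))


X = simulate_linear_chain()
variances = X.var(axis=0)
order = sort_by_variance(X)
print("marginal variances:", np.round(variances, 2))
print("variance-based order:", order.tolist())
```

In this toy setup the population variances grow as 1, 3.25, 8.31, 19.70, so sorting by variance recovers the causal order of the chain without using any structural information. The score-based pattern discussed in the abstract concerns the variance of $\nabla \log p_{\mathbf{X}}$ rather than the marginal variance of the variables, and it is not reproduced here.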