We study the class of location-scale or heteroscedastic noise models (LSNMs), in which the effect $Y$ can be written as a function of the cause $X$ and a noise source $N$ independent of $X$, which may be scaled by a positive function $g$ of the cause, i.e., $Y = f(X) + g(X)N$. Despite the generality of the model class, we show that the causal direction is identifiable except in some pathological cases. To empirically validate these theoretical findings, we propose two estimators for LSNMs: one based on (non-linear) feature maps, and one based on neural networks. Both model the conditional distribution of $Y$ given $X$ as a Gaussian parameterized by its natural parameters. When the feature maps are correctly specified, we prove that our estimator is jointly concave and consistent for the cause-effect identification task. Although the neural network does not inherit those guarantees, it can fit functions of arbitrary complexity, and reaches state-of-the-art performance across benchmarks.
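As a minimal sketch of the model class and the Gaussian parameterization mentioned above, the following Python snippet simulates data from $Y = f(X) + g(X)N$ and evaluates the conditional Gaussian log-likelihood in natural parameters $\eta_1 = \mu/\sigma^2$ and $\eta_2 = -1/(2\sigma^2)$. The specific choices $f(x) = \sin(x)$ and $g(x) = e^{x/2}$, the standard normal cause and noise, and all function names are illustrative assumptions, not the estimators proposed here; the snippet performs no fitting and no direction inference.

```python
import math
import random


def f_true(x):
    # Hypothetical location function f (an assumption for illustration).
    return math.sin(x)


def g_true(x):
    # Hypothetical scale function g; the exponential keeps it strictly positive.
    return math.exp(0.5 * x)


def simulate_lsnm(n, seed=0):
    """Draw n pairs (x, y) with y = f(x) + g(x) * noise, noise independent of x."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)      # cause X (assumed standard normal)
        noise = rng.gauss(0.0, 1.0)  # noise N, drawn independently of X
        xs.append(x)
        ys.append(f_true(x) + g_true(x) * noise)
    return xs, ys


def to_natural(mu, sigma2):
    """Map mean and variance to Gaussian natural parameters (eta1, eta2)."""
    return mu / sigma2, -0.5 / sigma2


def loglik_natural(y, eta1, eta2):
    """Gaussian log-density of y written in natural parameters; requires eta2 < 0."""
    if eta2 >= 0:
        raise ValueError("eta2 must be negative for a valid Gaussian")
    return (eta1 * y + eta2 * y * y
            + eta1 * eta1 / (4.0 * eta2)
            + 0.5 * math.log(-2.0 * eta2)
            - 0.5 * math.log(2.0 * math.pi))


def loglik_mean_var(y, mu, sigma2):
    """The same log-density in the usual mean/variance form, for comparison."""
    return -0.5 * math.log(2.0 * math.pi * sigma2) - (y - mu) ** 2 / (2.0 * sigma2)


def mean_conditional_loglik(xs, ys):
    """Average log p(y | x) under the true f and g, via natural parameters."""
    total = 0.0
    for x, y in zip(xs, ys):
        eta1, eta2 = to_natural(f_true(x), g_true(x) ** 2)
        total += loglik_natural(y, eta1, eta2)
    return total / len(xs)


if __name__ == "__main__":
    xs, ys = simulate_lsnm(1000, seed=0)
    print(mean_conditional_loglik(xs, ys))
```

In this parameterization the conditional mean is $f(x) = -\eta_1/(2\eta_2)$ and the conditional variance is $g(x)^2 = -1/(2\eta_2)$, so location and scale are both recovered from the pair $(\eta_1, \eta_2)$.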