The discovery of causal relationships among a set of random variables is a fundamental objective of science and has recently been argued to be an essential component of real machine intelligence. One class of causal discovery techniques is founded on the argument that there are inherent structural asymmetries between the causal and anti-causal directions that can be leveraged to determine the direction of causation. Capturing these discrepancies between cause and effect remains a challenge, and many current state-of-the-art algorithms propose comparing the norms of the kernel mean embeddings of the conditional distributions. In this work, we argue that such approaches based on RKHS embeddings are insufficient for capturing principal markers of cause-effect asymmetry that involve higher-order structural variabilities of the conditional distributions. We propose the Kernel Intrinsic Invariance Measure with Heterogeneous Transform (KIIM-HT), which introduces a novel score measure based on a heterogeneous transformation of RKHS embeddings to extract relevant higher-order moments of the conditional densities for causal discovery. Inference is made by comparing the scores of the hypothetical cause-effect directions. Tests and comparisons on a synthetic dataset, a two-dimensional synthetic dataset and the real-world benchmark dataset Tübingen Cause-Effect Pairs verify our approach. In addition, we conduct a sensitivity analysis with respect to the regularization parameter to compare previous work faithfully with our method, and an experiment with trials over varied hyperparameter values to showcase the robustness of our algorithm.
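To make the baseline idea concrete, the following is a minimal sketch of the kind of norm-comparison approach mentioned above. It is not KIIM-HT, whose heterogeneous transform is not specified in this abstract. The sketch estimates conditional mean embeddings by kernel ridge regression, computes their RKHS norms at each sample point, and scores a direction by the variance of those norms. The function names, the Gaussian kernel, the default bandwidth and regularization values, and the rule that the lower-variability direction is preferred are all assumptions made for illustration.

```python
import numpy as np


def rbf_gram(a, b, bandwidth):
    """Gaussian (RBF) Gram matrix between two 1-D samples."""
    sq_dist = (a[:, None] - b[None, :]) ** 2
    return np.exp(-sq_dist / (2.0 * bandwidth ** 2))


def embedding_norm_variability(cause, effect, reg=1e-3, bandwidth=1.0):
    """Hypothetical score: variance of the RKHS norms of the estimated
    conditional mean embeddings of `effect` given each value of `cause`."""
    # Standardise so one bandwidth is reasonable for both variables (assumption).
    cause = (cause - cause.mean()) / cause.std()
    effect = (effect - effect.mean()) / effect.std()
    n = len(cause)
    K = rbf_gram(cause, cause, bandwidth)    # kernel on the hypothesised cause
    L = rbf_gram(effect, effect, bandwidth)  # kernel on the hypothesised effect
    # Column i of W holds the weights of the embedding at cause[i]:
    # mu_i = sum_j W[j, i] * l(effect[j], .), with W = (K + n*reg*I)^{-1} K.
    W = np.linalg.solve(K + n * reg * np.eye(n), K)
    # Squared RKHS norm of each embedding: W[:, i]^T L W[:, i].
    sq_norms = np.einsum("ji,jk,ki->i", W, L, W)
    norms = np.sqrt(np.maximum(sq_norms, 0.0))
    return float(norms.var())


def infer_direction(x, y, **kwargs):
    """Compare the scores of both hypothetical directions.
    Assumption: the direction with the lower variability is preferred."""
    s_xy = embedding_norm_variability(x, y, **kwargs)
    s_yx = embedding_norm_variability(y, x, **kwargs)
    label = "x->y" if s_xy < s_yx else "y->x"
    return label, s_xy, s_yx


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = x ** 3 + 0.1 * rng.normal(size=200)  # toy pair generated with x as the cause
    print(infer_direction(x, y))
```

The score here depends only on the norms of the embeddings. The abstract argues that such norm-based scores can miss higher-order structure of the conditional distributions, which is the gap KIIM-HT is proposed to address.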