The need for causal discovery is ubiquitous. Understanding not only the stochastic dependencies between parts of a system but also the actual cause-effect relations is essential across the sciences, so the demand for reliable methods to detect causal directions keeps growing. Over the last 50 years, many causal discovery algorithms have emerged, but most are applicable only under the assumptions that the system has no feedback loops and that it is causally sufficient, i.e., that there are no unmeasured subsystems affecting multiple measured variables. This is unfortunate, since these assumptions often cannot be made in practice: feedback is an integral feature of many processes, and real-world systems are rarely completely isolated and fully measured. Fortunately, several techniques that can cope with cyclic, causally insufficient systems have been developed in recent years. With multiple methods available, applying them in practice requires knowledge of their respective strengths and weaknesses. Here, we focus on the problem of causal discovery for sparse linear models that are allowed to have cycles and hidden confounders. We present a comprehensive comparative study of four causal discovery techniques: two versions of the LLC method [10] and two variants of the ASP-based algorithm [11]. The evaluation investigates the performance of these techniques in various experiments with multiple interventional setups and different dataset sizes.