Comparative Study of Causal Discovery Methods for Cyclic Models with Hidden Confounders

The need for causal discovery is ubiquitous. Understanding not just the stochastic dependencies between parts of a system but also the actual cause-effect relations is essential across the sciences, and the demand for reliable methods to detect causal directions keeps growing. In the last 50 years, many causal discovery algorithms have emerged, but most of them apply only under the assumptions that the system has no feedback loops and that it is causally sufficient, i.e. that there are no unmeasured subsystems that can affect multiple measured variables. This is unfortunate, since those restrictions often cannot be presumed in practice: feedback is an integral feature of many processes, and real-world systems are rarely completely isolated and fully measured. Fortunately, several techniques that can cope with cyclic, causally insufficient systems have been developed in recent years. With multiple methods available, applying them in practice now requires knowledge of their respective strengths and weaknesses. Here, we focus on causal discovery for sparse linear models that are allowed to have cycles and hidden confounders. We present a comprehensive comparative study of four causal discovery techniques: two versions of the LLC method [10] and two variants of the ASP-based algorithm [11]. The evaluation investigates the performance of these techniques across experiments with multiple interventional setups and different dataset sizes.

