Concomitant DAG Learning: On the Roles of Noise Adaptivity, Sparsity, and Non-negativity

Source: arXiv. Submitted to the IEEE Signal Processing Magazine Special Issue: From Signals to Causes: Methodological Advances in Causal Inference. arXiv admin note: text overlap with arXiv:2310.02895

Directed acyclic graphs (DAGs) are a central modeling tool for principled reasoning about cause-effect interactions in complex systems. Because the causal structure underlying a group of variables is often unknown, and interventions may be infeasible or ethically challenging to implement, DAGs must frequently be inferred from observational data alone. Most classical structure identification approaches face two key obstacles in this setting: the combinatorial challenge of enforcing acyclicity, which severely limits scalability, and identifiability challenges arising from latent confounding or heterogeneous noise. This tutorial offers an overview of recent signal processing and optimization advances that address these issues by recasting DAG structure learning as a continuous, score-based estimation problem over adjacency matrices. We begin with a didactic introduction to structural equation models and the formulation of causal graph recovery, followed by a historical survey of score-based methods ranging from early combinatorial search schemes and greedy heuristics to modern continuous frameworks that leverage smooth characterizations of acyclicity. Building on this foundation, we describe concomitant DAG estimation methods that jointly infer sparse causal structure and exogenous noise levels; making the estimator noise adaptive in this way improves robustness under heteroscedasticity and distribution shifts. Overall, the tutorial introduces readers to challenges and opportunities for signal processing research at the crossroads of causal inference, high-dimensional statistics, and scalable graph learning, and outlines emerging directions including online, nonlinear, and neural causal discovery.
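To make the phrase "smooth characterizations of acyclicity" concrete, the following minimal sketch evaluates one well-known example, the trace-exponential function h(W) = tr(exp(W∘W)) − d introduced with NOTEARS (Zheng et al., 2018). The abstract does not say which characterization the tutorial adopts, so this choice is an assumption made for illustration. The helper names `matmul` and `acyclicity_penalty` are hypothetical, and the power series is truncated at d terms, which is enough to detect any directed cycle in a graph with d nodes.

```python
from math import factorial


def matmul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def acyclicity_penalty(w):
    """Return sum_{k=1..d} tr((W∘W)^k) / k! for a d x d weight matrix W.

    This is the series of tr(exp(W∘W)) - d truncated at d terms (an
    illustrative assumption, not necessarily the tutorial's choice).
    tr((W∘W)^k) sums the squared-weight products over closed walks of
    length k, so the penalty is zero when the graph has no directed cycle
    and positive otherwise.
    """
    d = len(w)
    # Element-wise square makes every entry non-negative.
    a = [[w[i][j] ** 2 for j in range(d)] for i in range(d)]
    power = [row[:] for row in a]  # holds (W∘W)^k, starting at k = 1
    total = 0.0
    for k in range(1, d + 1):
        total += sum(power[i][i] for i in range(d)) / factorial(k)
        power = matmul(power, a)
    return total


# Chain 1 -> 2 -> 3: acyclic, so the penalty vanishes.
w_dag = [[0.0, 1.5, 0.0],
         [0.0, 0.0, -2.0],
         [0.0, 0.0, 0.0]]

# Cycle 1 -> 2 -> 3 -> 1 with unit weights: the penalty is positive.
w_cyclic = [[0.0, 1.0, 0.0],
            [0.0, 0.0, 1.0],
            [1.0, 0.0, 0.0]]

print(acyclicity_penalty(w_dag))     # 0.0
print(acyclicity_penalty(w_cyclic))  # 0.5
```

Because the penalty is a polynomial in the entries of W, it can be used as a differentiable constraint or regularizer inside a continuous optimizer, which is what removes the need for combinatorial search over graph structures.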
