We deal with the combinatorial problem of learning directed acyclic graph (DAG) structure from observational data adhering to a linear structural equation model (SEM). Leveraging advances in differentiable, nonconvex characterizations of acyclicity, recent efforts have advocated a continuous constrained optimization paradigm to efficiently explore the space of DAGs. Most existing methods employ lasso-type score functions to guide this search, which (i) require expensive penalty parameter retuning when the $\textit{unknown}$ SEM noise variances change across problem instances; and (ii) implicitly rely on limiting homoscedasticity assumptions. In this work, we propose a new convex score function for sparsity-aware learning of linear DAGs, which incorporates concomitant estimation of scale and thus effectively decouples the sparsity parameter from the exogenous noise levels. Regularization via a smooth, nonconvex acyclicity penalty term yields CoLiDE ($\textbf{Co}$ncomitant $\textbf{Li}$near $\textbf{D}$AG $\textbf{E}$stimation), a regression-based criterion amenable to efficient gradient computation and closed-form estimation of noise variances in heteroscedastic scenarios. Our algorithm outperforms state-of-the-art methods without incurring added complexity, especially when the DAGs are larger and the noise level profile is heterogeneous. We also find CoLiDE exhibits enhanced stability manifested via reduced standard deviations in several domain-specific metrics, underscoring the robustness of our novel linear DAG estimator.
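To make the decoupling argument concrete, the following is a minimal sketch of a concomitant score of the kind described above. The abstract does not state the functional form, so the code assumes a smoothed concomitant lasso criterion: a residual term divided by the noise scale, plus a linear term in the scale, plus an $\ell_1$ penalty on the weights. It also assumes a row-sample convention $X \approx XW + Z$. The function names, the `sigma_min` floor, and the polynomial acyclicity function are illustrative choices, not taken from the text. Under these assumptions, minimizing over the scale gives the closed forms $\hat{\sigma}^2 = \|X - XW\|_F^2/(nd)$ in the equal-variance case and $\hat{\sigma}_i^2 = \|x_i - Xw_i\|_2^2/n$ per node in the heteroscedastic case.

```python
import numpy as np


def concomitant_score(W, X, lam, per_node=False, sigma_min=1e-8):
    """Concomitant lasso-type score with the noise scale profiled out in closed form.

    Assumed form (not given in the abstract):
      shared scale:   ||X - XW||_F^2 / (2 n sigma) + d sigma / 2 + lam ||W||_1
      per-node scale: sum_i rss_i / (2 n sigma_i) + sum_i sigma_i / 2 + lam ||W||_1
    X is (n, d) with samples in rows; W is (d, d) with W[i, j] the weight of edge i -> j.
    """
    n, d = X.shape
    resid = X - X @ W  # row convention: X ~ X W + noise
    if per_node:
        rss = np.sum(resid ** 2, axis=0)  # one residual sum of squares per node
        sigma = np.maximum(np.sqrt(rss / n), sigma_min)  # closed-form minimizer, floored
        fit = np.sum(rss / (2 * n * sigma)) + np.sum(sigma) / 2
    else:
        rss = np.sum(resid ** 2)
        sigma = max(np.sqrt(rss / (n * d)), sigma_min)  # closed-form minimizer, floored
        fit = rss / (2 * n * sigma) + d * sigma / 2
    return fit + lam * np.abs(W).sum(), sigma


def acyclicity(W):
    """Polynomial acyclicity function tr((I + W∘W / d)^d) - d; zero iff W encodes a DAG.

    Chosen here for illustration; the abstract only says the penalty is smooth and nonconvex.
    """
    d = W.shape[0]
    M = np.eye(d) + (W * W) / d
    return np.trace(np.linalg.matrix_power(M, d)) - d


def simulate_linear_sem(W, n, noise_std, seed=0):
    """Draw n samples from X = X W + Z with Gaussian Z (hypothetical helper for the demo)."""
    rng = np.random.default_rng(seed)
    d = W.shape[0]
    Z = rng.normal(size=(n, d)) * np.asarray(noise_std)  # scalar or per-node std
    return Z @ np.linalg.inv(np.eye(d) - W)


if __name__ == "__main__":
    W_true = np.triu(np.full((4, 4), 0.5), k=1)  # strictly upper triangular, hence a DAG
    X = simulate_linear_sem(W_true, n=5000, noise_std=[0.5, 1.0, 2.0, 4.0])
    score, sigma_hat = concomitant_score(W_true, X, lam=0.1, per_node=True)
    print("per-node scale estimates:", np.round(sigma_hat, 2))
    print("acyclicity at W_true:", acyclicity(W_true))
```

Because the scale estimate enters the data-fit term as a divisor, rescaling the noise rescales the residual term and the scale estimate together, which is the mechanism by which the sparsity weight `lam` can stay fixed across noise levels in this sketch. A full estimator would minimize the score plus a multiple of the acyclicity term over `W`, alternating with the closed-form scale update.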