Directed Acyclic Graphs (DAGs) are central to uncovering causal structure in complex systems, yet learning a single DAG from data is often challenging: model uncertainty, finite samples, and a combinatorially large search space frequently yield unstable estimates. We propose DAGgr, a model averaging framework that aggregates multiple candidate DAGs into a single stable representation. Candidate graphs are weighted by their out-of-sample predictive likelihood across repeated data splits, and a thresholding rule on the resulting edge-importance scores guarantees that the aggregated graph is itself acyclic. We establish a finite-sample risk bound, prove that the procedure preserves acyclicity, and show that edge selection is consistent under mild conditions on the weights. Simulations across random, hub, and chain structures, together with an analysis of the Sachs et al. (2005) protein-signaling network, show that DAGgr matches or exceeds the best individual candidate while consistently outperforming bootstrap-aggregation baselines across structural recovery metrics.
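The aggregation step described above can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the weighting formula or the threshold, so the softmax over held-out log-likelihoods, the default threshold `tau = 1 - 1/p`, and all function names below are assumptions chosen for illustration. The threshold is one sufficient choice: if every candidate is acyclic and the weights sum to one, any directed cycle of length k has at most k - 1 of its edges in a single candidate, so keeping only edges with importance strictly above 1 - 1/p cannot produce a cycle.

```python
import numpy as np

def softmax_weights(heldout_loglik):
    """Map per-candidate held-out log-likelihoods to weights summing to 1.
    (Assumed weighting scheme; the abstract does not give the formula.)"""
    ll = np.asarray(heldout_loglik, dtype=float)
    z = np.exp(ll - ll.max())  # subtract the max for numerical stability
    return z / z.sum()

def edge_importance(adjacencies, weights):
    """Weighted frequency of each directed edge across the candidates.
    adjacencies has shape (m, p, p), with entry [c, i, j] = 1 iff i -> j in candidate c."""
    A = np.asarray(adjacencies, dtype=float)
    return np.tensordot(weights, A, axes=1)  # shape (p, p)

def aggregate(adjacencies, heldout_loglik, tau=None):
    """Threshold the edge-importance scores to obtain one aggregated graph."""
    A = np.asarray(adjacencies)
    p = A.shape[1]
    if tau is None:
        tau = 1.0 - 1.0 / p  # assumed threshold; sufficient for acyclicity
    w = softmax_weights(heldout_loglik)
    s = edge_importance(A, w)
    return (s > tau).astype(int), s, w

def is_acyclic(adj):
    """Check acyclicity by repeatedly peeling off nodes with no incoming edges."""
    adj = np.asarray(adj, dtype=int)
    remaining = list(range(adj.shape[0]))
    while remaining:
        sub = adj[np.ix_(remaining, remaining)]
        sources = [v for k, v in enumerate(remaining) if sub[:, k].sum() == 0]
        if not sources:
            return False  # every remaining node has a parent, so a cycle exists
        remaining = [v for v in remaining if v not in sources]
    return True

# Three hypothetical candidate DAGs on three nodes (toy data, not from the paper).
candidates = np.array([
    [[0, 1, 0], [0, 0, 1], [0, 0, 0]],  # 0 -> 1, 1 -> 2
    [[0, 1, 1], [0, 0, 0], [0, 0, 0]],  # 0 -> 1, 0 -> 2
    [[0, 0, 0], [1, 0, 1], [0, 0, 0]],  # 1 -> 0, 1 -> 2
])
heldout_loglik = [-10.0, -10.5, -14.0]  # hypothetical out-of-sample scores

G, scores, weights = aggregate(candidates, heldout_loglik)
print(G)
print(is_acyclic(G))
```

The default threshold is deliberately conservative and becomes stricter as the number of nodes grows; a smaller `tau` retains more edges but loses the guarantee, in which case `is_acyclic` can be used to check the result.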