We present a boosting-based method to learn additive Structural Equation Models (SEMs) from observational data, with a focus on the theoretical aspects of determining the causal order among variables. We introduce a family of score functions based on arbitrary regression techniques, for which we establish necessary conditions to consistently favor the true causal ordering. Our analysis reveals that boosting with early stopping meets these criteria and thus offers a consistent score function for causal orderings. To address the challenges posed by high-dimensional data sets, we adapt our approach through component-wise gradient descent in the space of additive SEMs. Our simulation study supports our theoretical results in lower dimensions and demonstrates that our high-dimensional adaptation is competitive with state-of-the-art methods. In addition, the procedure is robust to the choice of hyperparameters, which makes it easy to tune.
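To make the idea of a regression-based score for causal orderings concrete, the following is a minimal sketch and not the procedure analyzed in the paper. It assumes a score equal to the sum of log residual variances along an ordering, a common choice for additive noise models that the abstract does not itself specify. The regression is component-wise L2 boosting with a fixed step budget standing in for early stopping. The base learners (centered monomials of each predecessor), the names `boosted_residual_variance` and `order_score`, and the parameters `steps` and `nu` are all hypothetical choices made for illustration.

```python
import math
import random


def center(v):
    """Subtract the sample mean from a list of floats."""
    m = sum(v) / len(v)
    return [a - m for a in v]


def boosted_residual_variance(y, parents, steps=50, nu=0.1):
    """Residual variance of y after component-wise L2 boosting on parents.

    Assumption: base learners are centered monomials x, x^2, x^3 of each
    parent column; the fixed number of steps acts as early stopping.
    """
    resid = center(y)
    feats = [center([x ** p for x in col]) for col in parents for p in (1, 2, 3)]
    for _ in range(steps):
        best = None
        for f in feats:
            ff = sum(a * a for a in f)
            if ff == 0.0:
                continue
            # Least-squares coefficient of the current residuals on this feature.
            beta = sum(a * r for a, r in zip(f, resid)) / ff
            gain = beta * beta * ff  # RSS reduction of a full least-squares step
            if best is None or gain > best[0]:
                best = (gain, beta, f)
        if best is None:  # no usable feature (e.g. no parents)
            break
        _, beta, f = best
        # Shrunken update along the single best component.
        resid = [r - nu * beta * a for r, a in zip(resid, f)]
    return sum(r * r for r in resid) / len(resid)


def order_score(data, order, **kw):
    """Sum of log residual variances along the ordering; lower is better."""
    score = 0.0
    for k, j in enumerate(order):
        parents = [data[i] for i in order[:k]]
        score += math.log(boosted_residual_variance(data[j], parents, **kw))
    return score


# Toy additive SEM (illustrative only): X0 -> X1 with a nonlinear effect.
rng = random.Random(0)
n = 300
x0 = [rng.gauss(0, 1) for _ in range(n)]
x1 = [a * a + 0.5 * rng.gauss(0, 1) for a in x0]
data = [x0, x1]

score_true = order_score(data, (0, 1))
score_reversed = order_score(data, (1, 0))
print("true order preferred:", score_true < score_reversed)
```

In this toy example, a score that favors the true ordering assigns `(0, 1)` a lower value than `(1, 0)`. The sketch scores each ordering exhaustively, which is only feasible for a handful of variables; the high-dimensional adaptation mentioned in the abstract is not reproduced here.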