Proper hyperparameter tuning is essential for achieving optimal performance of modern machine learning (ML) methods in predictive tasks. While there is an extensive literature on tuning ML learners for prediction, little guidance is available on tuning ML learners for causal machine learning or on selecting among different ML learners. In this paper, we empirically assess the relationship between the predictive performance of ML methods and the resulting causal estimates in the Double Machine Learning (DML) approach of Chernozhukov et al. (2018). DML estimates so-called nuisance parameters by treating them as supervised learning problems and uses the resulting predictions as plug-in estimates to solve for the (causal) target parameter. We conduct an extensive simulation study using data from the 2019 Atlantic Causal Inference Conference Data Challenge and provide empirical insights into the role of hyperparameter tuning and other practical decisions for causal estimation with DML. First, we assess the importance of data splitting schemes for tuning ML learners within DML. Second, we investigate how the choice of ML methods and hyperparameters, including recent AutoML frameworks, affects the estimation performance for a causal parameter of interest. Third, we assess to what extent the choice of a particular causal model, as characterized by its parametric assumptions, can be based on predictive performance metrics.
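The plug-in step described in the abstract can be illustrated with a minimal sketch. It assumes a partially linear model Y = theta*D + g(X) + e with a single covariate, a partialling-out score, and 5-fold cross-fitting; none of these choices is specified in the abstract. The helper names (`fit_linear`, `dml_plr`, `simulate`) and the simulated data are hypothetical, and the one-dimensional least-squares learner merely stands in for whichever tuned ML learner would be used in practice.

```python
import random


def fit_linear(xs, ys):
    """Stand-in nuisance learner: fit y ~ a + b*x by least squares."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def dml_plr(y, d, x, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta (assumed model)."""
    n = len(y)
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    folds = [idx[k::n_folds] for k in range(n_folds)]
    res_y = [0.0] * n
    res_d = [0.0] * n
    for test in folds:
        held_out = set(test)
        train = [i for i in idx if i not in held_out]
        # Nuisance functions, each learned as a supervised problem
        # on the training folds only: l(X) = E[Y|X] and m(X) = E[D|X].
        l_hat = fit_linear([x[i] for i in train], [y[i] for i in train])
        m_hat = fit_linear([x[i] for i in train], [d[i] for i in train])
        # Out-of-fold residuals are the plug-in quantities.
        for i in test:
            res_y[i] = y[i] - l_hat(x[i])
            res_d[i] = d[i] - m_hat(x[i])
    # Solve sum((res_y - theta * res_d) * res_d) = 0 for theta.
    num = sum(ry * rd for ry, rd in zip(res_y, res_d))
    den = sum(rd * rd for rd in res_d)
    return num / den


def simulate(n=2000, theta=0.5, seed=1):
    """Hypothetical data with confounding through x."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    d = [0.8 * xi + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 1.5 * xi + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x


y, d, x = simulate()
theta_hat = dml_plr(y, d, x)

# Naive slope of y on d that ignores x, for comparison.
md = sum(d) / len(d)
my = sum(y) / len(y)
naive = (sum((di - md) * (yi - my) for di, yi in zip(d, y))
         / sum((di - md) ** 2 for di in d))
print(f"DML estimate: {theta_hat:.3f}, naive estimate: {naive:.3f}")
```

In this sketch the quality of `l_hat` and `m_hat` enters the causal estimate only through the out-of-fold residuals, which is the link between predictive performance and causal estimation that the study examines. Swapping `fit_linear` for a differently tuned learner changes the residuals and therefore `theta_hat`.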