Recent developments in causal machine learning methods have made it easier to estimate flexible relationships between confounders, treatments, and outcomes, making unconfoundedness assumptions in causal analysis more palatable. How successful are these approaches at recovering ground-truth baselines? In this paper, we analyze a new data sample that combines an experimental rollout of a new feature at a large technology company with a concurrent sample of users who endogenously opted into the feature. We find that recovering ground-truth causal effects is feasible, but only with careful modeling choices. Our results build on the observational causal literature beginning with LaLonde (1986), offering best practices for more credible treatment effect estimation in modern, high-dimensional datasets.