When estimating treatment effects, the gold standard is to conduct a randomized experiment and then contrast the outcomes of the treatment group and the control group. However, in many cases, randomized experiments are either conducted at a much smaller scale than the target population or raise ethical issues that make them hard to implement. Therefore, researchers usually rely on observational data to study causal relationships. The downside is that the unconfoundedness assumption, which is the key to validating the use of observational data, is hard to verify and almost always violated. Hence, any conclusion drawn from observational data should be analyzed with great care. Given the richness of observational data and the usefulness of experimental data, researchers hope to develop credible methods that combine the strengths of the two. In this paper, we consider a setting where the observational data contain the outcome of interest as well as a surrogate outcome, while the experimental data contain only the surrogate outcome. We propose a simple estimator of the average treatment effect of interest that uses both the observational data and the experimental data.