Constructing confidence intervals (CIs) for the average treatment effect (ATE) from patient records is crucial for assessing the effectiveness and safety of drugs. However, patient records typically come from different hospitals, which raises the question of how multiple observational datasets can be effectively combined for this purpose. In this paper, we propose a new method that estimates the ATE from multiple observational datasets and provides valid CIs. Our method makes few assumptions about the observational datasets and is thus widely applicable in medical practice. The key idea is to leverage prediction-powered inference and thereby essentially `shrink' the CIs, offering more precise uncertainty quantification than na\"ive approaches. We further prove the unbiasedness of our method and the validity of our CIs, and we confirm our theoretical results through various numerical experiments. Finally, we provide an extension of our method for constructing CIs from combinations of experimental and observational datasets.
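To make the `shrinking' idea concrete, the following is a minimal sketch of generic prediction-powered inference for a population mean, not the estimator proposed in the paper. It assumes a small dataset with trusted outcomes, a large dataset for which only model predictions are available, and a normal approximation for the interval. The function names (`ppi_mean_ci`, `classical_mean_ci`) and the simulated data are hypothetical and chosen purely for illustration.

```python
import numpy as np
from statistics import NormalDist


def classical_mean_ci(y, alpha=0.05):
    """Normal-approximation CI for a mean using only the small dataset."""
    y = np.asarray(y, dtype=float)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    est = y.mean()
    se = np.sqrt(y.var(ddof=1) / len(y))
    return est, (est - z * se, est + z * se)


def ppi_mean_ci(y_small, pred_small, pred_large, alpha=0.05):
    """Prediction-powered CI for a mean (generic sketch).

    y_small    : trusted outcomes on the small dataset
    pred_small : model predictions on the small dataset
    pred_large : model predictions on the large dataset
    """
    y_small = np.asarray(y_small, dtype=float)
    pred_small = np.asarray(pred_small, dtype=float)
    pred_large = np.asarray(pred_large, dtype=float)

    # The rectifier measures the prediction error on the small dataset
    # and corrects the bias of the large-sample prediction average.
    rectifier = y_small - pred_small
    est = pred_large.mean() + rectifier.mean()

    # The two datasets are assumed independent, so the variances add.
    se = np.sqrt(
        pred_large.var(ddof=1) / len(pred_large)
        + rectifier.var(ddof=1) / len(rectifier)
    )
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return est, (est - z * se, est + z * se)


# Simulated illustration (all numbers are made up for this sketch).
rng = np.random.default_rng(0)
x_small = rng.normal(size=200)
x_large = rng.normal(size=20_000)
y_small = 2.0 * x_small + rng.normal(scale=0.5, size=200)


def predictor(x):
    # A deliberately biased but informative predictor.
    return 2.0 * x + 0.3


est_c, ci_c = classical_mean_ci(y_small)
est_p, ci_p = ppi_mean_ci(y_small, predictor(x_small), predictor(x_large))
print("classical CI width:", ci_c[1] - ci_c[0])
print("PPI CI width:      ", ci_p[1] - ci_p[0])
```

When the predictions are informative, the rectifier has much smaller variance than the raw outcomes, so the prediction-powered interval is narrower than the classical one while the rectifier removes the predictor's bias. In the ATE setting, the outcomes would be replaced by suitable treatment-effect quantities, which this sketch does not attempt to model.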