Constructing Confidence Intervals for Average Treatment Effects from Multiple Datasets

Constructing confidence intervals (CIs) for the average treatment effect (ATE) from patient records is crucial for assessing the effectiveness and safety of drugs. However, patient records typically come from different hospitals, which raises the question of how multiple observational datasets can be effectively combined for this purpose. In our paper, we propose a new method that estimates the ATE from multiple observational datasets and provides valid CIs. Our method makes few assumptions about the observational datasets and is thus widely applicable in medical practice. The key idea of our method is to leverage prediction-powered inference and thereby essentially "shrink" the CIs, so that we offer more precise uncertainty quantification than naive approaches. We further prove the unbiasedness of our method and the validity of our CIs. We confirm our theoretical results through various numerical experiments. Finally, we provide an extension of our method for constructing CIs from combinations of experimental and observational datasets.
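To make the "shrinking" idea concrete, the following is a minimal sketch of generic prediction-powered inference for a simple mean. It is not the estimator proposed in the paper. It assumes a small sample with observed outcomes, a large independent sample with only model predictions, and a normal approximation. The function names, the synthetic data, and the deliberately biased predictor are all hypothetical and chosen for illustration only.

```python
from statistics import NormalDist

import numpy as np


def classical_mean_ci(y_labeled, alpha=0.05):
    """Normal-approximation CI for a mean using only the small labeled sample."""
    y = np.asarray(y_labeled, dtype=float)
    z = NormalDist().inv_cdf(1 - alpha / 2)
    center = y.mean()
    half = z * np.sqrt(y.var(ddof=1) / y.size)
    return center - half, center + half


def ppi_mean_ci(y_labeled, pred_labeled, pred_unlabeled, alpha=0.05):
    """Prediction-powered CI for a mean (generic sketch, not the paper's method).

    The point estimate is the mean prediction on the large sample plus a
    'rectifier': the mean prediction error measured on the small sample.
    Assumes the two samples are independent draws from the same population.
    """
    y = np.asarray(y_labeled, dtype=float)
    f_lab = np.asarray(pred_labeled, dtype=float)
    f_unl = np.asarray(pred_unlabeled, dtype=float)
    rectifier = y - f_lab
    z = NormalDist().inv_cdf(1 - alpha / 2)
    center = f_unl.mean() + rectifier.mean()
    variance = (f_unl.var(ddof=1) / f_unl.size
                + rectifier.var(ddof=1) / rectifier.size)
    half = z * np.sqrt(variance)
    return center - half, center + half


# Hypothetical synthetic data: outcome y = 2x + noise, so the true mean is 2.0.
rng = np.random.default_rng(0)
x_small = rng.normal(1.0, 1.0, size=200)     # small sample, outcomes observed
x_large = rng.normal(1.0, 1.0, size=20_000)  # large sample, predictions only
y_small = 2.0 * x_small + rng.normal(0.0, 0.5, size=x_small.size)


def predict(x):
    """Stand-in for a fitted model; the +0.2 offset makes it biased on purpose."""
    return 2.0 * x + 0.2


classical = classical_mean_ci(y_small)
ppi = ppi_mean_ci(y_small, predict(x_small), predict(x_large))
print("classical CI:", classical)
print("prediction-powered CI:", ppi)
```

In this sketch the rectifier removes the predictor's bias, and the interval narrows only to the extent that the predictions explain the outcome. With uninformative predictions the prediction-powered interval would be no tighter than the classical one.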

