We are interested in estimating the effect of a treatment applied to individuals at multiple sites, where data are stored locally at each site. Due to privacy constraints, individual-level data cannot be shared across sites; the sites may also have heterogeneous populations and treatment assignment mechanisms. Motivated by these considerations, we develop federated methods to draw inference on the average treatment effects for the combined data across sites. Our methods first compute summary statistics locally using propensity scores and then aggregate these statistics across sites to obtain point and variance estimators of average treatment effects. We show that these estimators are consistent and asymptotically normal. To achieve these asymptotic properties, we find that the aggregation schemes need to account for the heterogeneity in treatment assignments and in outcomes across sites. We demonstrate the validity of our federated methods through a comparative study of two large medical claims databases.
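The two-stage idea (local summaries, then aggregation) can be illustrated with a minimal sketch. It is not the estimator developed in the paper. It assumes each site already has propensity scores for its own units, uses a simple inverse-propensity-weighted (IPW) score, and aggregates with sample-size weights, which is only one possible scheme. The function names `local_summary` and `aggregate` and the toy data are hypothetical.

```python
import numpy as np

def local_summary(y, w, e):
    """Run at one site: return (n, ATE estimate, variance estimate).

    y: outcomes, w: 0/1 treatment indicators, e: propensity scores,
    assumed here to be known or already estimated at the site.
    Only these three numbers leave the site, never unit-level data.
    """
    y, w, e = (np.asarray(a, dtype=float) for a in (y, w, e))
    # IPW score for each unit; its mean is the site's ATE estimate.
    psi = w * y / e - (1.0 - w) * y / (1.0 - e)
    n = psi.size
    return n, psi.mean(), psi.var(ddof=1) / n

def aggregate(summaries):
    """Run at the coordinator: combine site summaries with sample-size weights.

    This weighting is an illustrative assumption; other schemes
    (for example inverse-variance weights) target different estimands.
    """
    n, tau, var = (np.array(col, dtype=float) for col in zip(*summaries))
    weights = n / n.sum()
    # Sites are treated as independent, so the variances add with squared weights.
    return float(weights @ tau), float(weights**2 @ var)

# Toy data for two sites (made up purely for illustration).
site_a = local_summary(y=[1, 2, 3, 4], w=[1, 0, 1, 0], e=[0.5] * 4)
site_b = local_summary(y=[2, 2], w=[1, 0], e=[0.5, 0.5])
tau_hat, var_hat = aggregate([site_a, site_b])
```

With sample-size weights, the federated point estimate coincides with the IPW estimate that would be computed on the pooled data, although only three numbers per site are shared. Handling estimated propensity scores and cross-site heterogeneity, which the abstract identifies as essential for valid inference, requires more care than this sketch provides.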