Randomized controlled trials (RCTs) have become a powerful tool for assessing the impact of interventions and policies in many contexts. They are considered the gold standard for inference in the biomedical fields and in many social sciences. Researchers have published an increasing number of studies that rely on RCTs for at least part of the inference, and these studies typically include the response data collected, de-identified and sometimes protected through traditional disclosure limitation methods. In this paper, we empirically assess the impact of strong privacy-preservation methodology with differential privacy (DP) guarantees on published analyses from RCTs, leveraging the availability of replication packages (research compendia) in economics and policy analysis. We provide simulation studies and demonstrate how we can replicate the analysis in a published economics article on privacy-protected data under various parametrizations. We find that relatively straightforward DP-based methods allow for inference-valid protection of the published data, though computational issues may limit more complex analyses from using these methods. The results are applicable to researchers wishing to share RCT data with strong privacy protection, especially in the context of low- and middle-income countries.