Out of the participants in a randomized experiment with anticipated heterogeneous treatment effects, is it possible to identify which subjects have a positive treatment effect? While subgroup analysis has received attention, claims about individual participants are much more challenging. We frame the problem in terms of multiple hypothesis testing: each individual has a null hypothesis (stating that the potential outcomes are equal, for example) and we aim to identify those for whom the null is false (the treatment potential outcome stochastically dominates the control one, for example). We develop a novel algorithm that identifies such a subset, with nonasymptotic control of the false discovery rate (FDR). Our algorithm allows for interaction -- a human data scientist (or a computer program) may adaptively guide the algorithm in a data-dependent manner to gain power. We show how to extend the methods to observational settings and achieve a type of doubly-robust FDR control. We also propose several extensions: (a) relaxing the null to nonpositive effects, (b) moving from unpaired to paired samples, and (c) subgroup identification. We demonstrate via numerical experiments and theoretical analysis that the proposed method has valid FDR control in finite samples and reasonably high identification power.
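One common way to write down the testing problem described above is sketched here. The notation is an assumption introduced for illustration and is not taken from the abstract: $Y_i^T$ and $Y_i^C$ denote the treatment and control potential outcomes of subject $i$, $X_i$ the covariates, $\mathcal{H}_0$ the set of subjects whose null is true, $\mathcal{R}$ the set of subjects the procedure identifies, and $\alpha$ the target level.

```latex
% Zero-effect null for subject i (one possible formalization; conditioning on X_i is an assumption)
H_{0i}:\quad (Y_i^T \mid X_i) \overset{d}{=} (Y_i^C \mid X_i), \qquad i = 1, \dots, n.

% False discovery rate of a rejection set R, controlled at level alpha in finite samples
\mathrm{FDR} \;=\; \mathbb{E}\!\left[ \frac{|\mathcal{R} \cap \mathcal{H}_0|}{\max\{|\mathcal{R}|,\, 1\}} \right] \;\le\; \alpha .
```

Under this reading, "nonasymptotic control" means the inequality holds for every sample size $n$ rather than only in the limit, and extension (a) would replace the equality in $H_{0i}$ with a statement that the effect is nonpositive.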