Adaptive experiments are widely used to estimate the conditional average treatment effect (CATE) in clinical trials and many other settings. While the primary goal of an experiment is to maximize estimation accuracy, social welfare considerations make it also crucial to give patients the treatment with superior outcomes, which is measured by regret in the contextual bandit framework. These two objectives often call for conflicting optimal allocation mechanisms. Furthermore, privacy concerns arise in clinical scenarios involving sensitive data such as patients' health records. Therefore, it is essential for the treatment allocation mechanism to incorporate robust privacy protection measures. In this paper, we investigate the tradeoff between loss of social welfare and statistical power in contextual bandit experiments. We establish matching upper and lower bounds for the multi-objective optimization problem, and then adopt the concept of Pareto optimality to mathematically characterize the optimality condition. Furthermore, we propose differentially private algorithms that still match the lower bound, showing that privacy is "almost free". Additionally, we derive the asymptotic normality of the estimator, which is essential for statistical inference and hypothesis testing.