Adaptive experiments are widely used to estimate the conditional average treatment effect (CATE) in clinical trials and many other settings. While the primary goal of an experiment is to maximize estimation accuracy, social welfare considerations make it also crucial to give patients the treatment with superior outcomes, which is measured by regret in the contextual bandit framework. These two objectives often lead to conflicting optimal allocation mechanisms. Furthermore, privacy concerns arise in clinical scenarios involving sensitive data such as patients' health records, so it is essential for the treatment allocation mechanism to incorporate robust privacy protection. In this paper, we investigate the tradeoff between loss of social welfare and statistical power in contextual bandit experiments. We establish matching upper and lower bounds for the multi-objective optimization problem, and then adopt the concept of Pareto optimality to mathematically characterize the optimality condition. Furthermore, we propose differentially private algorithms that still match the lower bound, showing that privacy is "almost free". Additionally, we derive the asymptotic normality of the estimator, which is essential for statistical inference and hypothesis testing.