In good arm identification (GAI), the goal is to identify one arm whose average performance exceeds a given threshold (a good arm), if one exists. Few works have studied GAI in the fixed-budget setting, where the sampling budget is fixed beforehand, or in the anytime setting, where a recommendation can be requested at any time. We propose APGAI, an anytime and parameter-free sampling rule for GAI in stochastic bandits. APGAI can be used directly in both the fixed-confidence and fixed-budget settings. First, we derive upper bounds on its probability of error at any time. These bounds show that, on a variety of instances, adaptive strategies can detect the absence of good arms more efficiently than uniform sampling. Second, when APGAI is combined with a stopping rule, we prove upper bounds on the expected sampling complexity that hold at any confidence level. Finally, we show the good empirical performance of APGAI on synthetic and real-world data. Our work offers an extensive overview of the GAI problem in all settings.
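To make the problem setup concrete, the following is a minimal sketch of the uniform-sampling baseline in the fixed-budget setting, not of APGAI itself, whose sampling rule is not described in this abstract. The function name `uniform_gai`, the unit-variance Gaussian reward model, and the rule of recommending the empirically best arm only when its estimate exceeds the threshold are all assumptions made for illustration.

```python
import random


def uniform_gai(means, threshold, budget, seed=0):
    """Fixed-budget GAI baseline: round-robin sampling, then a threshold test.

    Returns the index of an arm believed to be good, or None if no arm's
    empirical mean exceeds the threshold. Rewards are simulated as
    unit-variance Gaussians (an assumption of this sketch).
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(budget):
        arm = t % k  # uniform (round-robin) allocation over the arms
        sums[arm] += rng.gauss(means[arm], 1.0)
        counts[arm] += 1
    # Empirical means; arms that were never pulled are treated as -inf.
    estimates = [s / n if n > 0 else float("-inf") for s, n in zip(sums, counts)]
    best = max(range(k), key=lambda a: estimates[a])
    # Recommend the empirical best arm only if it clears the threshold.
    return best if estimates[best] > threshold else None


if __name__ == "__main__":
    # One clearly good arm (index 1) versus an instance with no good arm.
    print(uniform_gai([0.0, 5.0, 0.0], threshold=1.0, budget=300))
    print(uniform_gai([-5.0, -5.0], threshold=0.0, budget=200))
```

An adaptive rule would replace the round-robin line with a data-dependent choice of arm; the abstract's claim is that such a choice can help most when no arm is good.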