In fixed-confidence best arm identification (BAI), the objective is to quickly identify the optimal option while controlling the probability of error below a desired threshold. Despite the plethora of BAI algorithms, existing methods typically fall short in practical settings, as stringent exact error control requires using loose tail inequalities and/or parametric restrictions. To overcome these limitations, we introduce a relaxed formulation that requires valid error control asymptotically with respect to a minimum sample size. This relaxation aligns with many real-world settings that often involve weak signals, high desired significance, and post-experiment inference requirements, all of which necessitate long horizons. It allows us to achieve tighter optimality, better handle flexible nonparametric outcome distributions, and fully leverage individual-level contexts. We develop novel asymptotic anytime-valid confidence sequences over arm indices and use them to design a new BAI algorithm for our asymptotic framework. Our method flexibly incorporates covariates for variance reduction and ensures approximate error control in fully nonparametric settings. Under mild convergence assumptions, we provide asymptotic bounds on the sample complexity and show that the worst-case sample complexity of our approach matches the best-case sample complexity of Gaussian BAI under exact error guarantees and known variances. Experiments suggest our approach reduces average sample complexities while maintaining error control.