In fixed-budget bandit identification, an algorithm sequentially observes samples from several distributions up to a given final time, then answers a query about the set of distributions. A good algorithm has a small probability of error. Although that probability decreases exponentially with the final time, the best attainable rate is not known precisely for most identification tasks. We show that if a fixed-budget task admits a complexity, defined as a lower bound on the probability of error that is attained by a single algorithm on all bandit problems, then that complexity is determined by the best non-adaptive sampling procedure for that problem. We further show that no such complexity exists for several fixed-budget identification tasks, including Bernoulli best arm identification with two arms: no single algorithm attains the best possible rate everywhere.
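To make the setting concrete, the following is a minimal simulation sketch of a non-adaptive sampling procedure on a two-armed Bernoulli problem. It assumes a uniform allocation (half of the budget on each arm) and an empirical-mean recommendation rule; the function name, its parameters, and the tie-breaking rule are illustrative choices, not taken from the paper.

```python
import random


def uniform_allocation_error(mu_a, mu_b, budget, runs=2000, seed=0):
    """Monte Carlo estimate of the error probability of a uniform,
    non-adaptive sampler on a two-armed Bernoulli bandit.

    Hypothetical helper for illustration only: the allocation is fixed
    in advance and never depends on the observed samples.
    """
    rng = random.Random(seed)
    best = 0 if mu_a > mu_b else 1
    pulls = budget // 2  # same number of samples from each arm
    errors = 0
    for _ in range(runs):
        # Draw all samples up front; the schedule is non-adaptive.
        sum_a = sum(rng.random() < mu_a for _ in range(pulls))
        sum_b = sum(rng.random() < mu_b for _ in range(pulls))
        # Recommend the arm with the larger empirical mean
        # (ties go to arm 0, an arbitrary assumption).
        guess = 0 if sum_a >= sum_b else 1
        errors += guess != best
    return errors / runs


if __name__ == "__main__":
    for budget in (20, 50, 100, 200):
        err = uniform_allocation_error(0.6, 0.4, budget)
        print(f"budget={budget:4d}  estimated error={err:.4f}")
```

Running the script for increasing budgets should show the estimated error shrinking quickly with the final time, which is the exponential decay the abstract refers to. The sketch says nothing about which rate is optimal; it only illustrates the kind of non-adaptive procedure against which a candidate complexity would be compared.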