In fixed-budget bandit identification, an algorithm sequentially observes samples from several distributions up to a given final time, then answers a query about the set of distributions. A good algorithm has a small probability of error. Although that probability decreases exponentially with the final time, the best attainable rate is not precisely known for most identification tasks. We show that if a fixed-budget task admits a complexity, defined as a lower bound on the probability of error that is attained by a single algorithm on all bandit problems, then that complexity is determined by the best non-adaptive sampling procedure for that problem. We also show that no such complexity exists for several fixed-budget identification tasks, including Bernoulli best arm identification with two arms: no single algorithm attains the best possible rate everywhere.
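To make the setting concrete, the following is a minimal Python sketch of a non-adaptive (static-allocation) procedure for Bernoulli best arm identification under a fixed budget, together with a Monte Carlo estimate of its probability of error. It illustrates the problem setup only, not the results stated above. The function names, the `weights` parameter, and the tie-breaking rule are assumptions introduced for illustration.

```python
import random


def run_static_allocation(means, budget, weights, rng):
    """One fixed-budget run with a non-adaptive sampling rule.

    means   -- true Bernoulli means, one per arm (unknown to a real algorithm)
    budget  -- final time T, the total number of samples
    weights -- fraction of the budget given to each arm, fixed in advance
    rng     -- a random.Random instance
    """
    # Non-adaptive: per-arm sample counts are chosen before any observation.
    # Rounding means the counts may not sum exactly to the budget (assumption
    # made to keep the sketch short).
    counts = [max(1, round(w * budget)) for w in weights]
    estimates = []
    for mu, n in zip(means, counts):
        successes = sum(rng.random() < mu for _ in range(n))
        estimates.append(successes / n)
    # Answer the query: the arm with the highest empirical mean.
    # Ties go to the lowest index (an arbitrary illustrative choice).
    return max(range(len(means)), key=lambda i: estimates[i])


def error_probability(means, budget, weights, runs=2000, seed=0):
    """Monte Carlo estimate of the probability of returning a wrong arm."""
    rng = random.Random(seed)
    best = max(range(len(means)), key=lambda i: means[i])
    errors = sum(
        run_static_allocation(means, budget, weights, rng) != best
        for _ in range(runs)
    )
    return errors / runs


if __name__ == "__main__":
    # Two Bernoulli arms, uniform allocation, increasing budgets.
    for budget in (20, 100, 400):
        estimate = error_probability([0.4, 0.6], budget, [0.5, 0.5])
        print(budget, estimate)
```

For a fixed pair of means, the estimated error should shrink as the budget grows, in line with the exponential decrease mentioned above. The open question concerns the rate of that decrease, which a simulation of this kind can only suggest.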