We investigate online classification with paid stochastic experts. In this setting, each expert must be paid before making a prediction. The amount paid to each expert directly influences the accuracy of that expert's prediction through an unknown Lipschitz "productivity" function. In each round, the learner must decide how much to pay each expert and then make a prediction. The learner incurs a cost equal to a weighted sum of the prediction error and the upfront payments for all experts. We introduce an online learning algorithm whose total cost after $T$ rounds exceeds that of a predictor which knows the productivity of all experts in advance by at most $\mathcal{O}(K^2(\log T)\sqrt{T})$, where $K$ is the number of experts. To achieve this result, we combine Lipschitz bandits and online classification with surrogate losses. These tools allow us to improve upon the bound of order $T^{2/3}$ that one would obtain in the standard Lipschitz bandit setting. We evaluate our algorithm empirically on synthetic data.
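To make the per-round protocol concrete, the following is a minimal sketch of how the cost of a single round could be simulated. It is not the algorithm from the paper: the `productivity` function, the `payment_weight` parameter, the label set {-1, +1}, and the majority-vote aggregation are all illustrative assumptions, chosen only to show how payments, expert accuracy, and the round cost fit together.

```python
import random


def productivity(payment):
    """Hypothetical productivity function (an assumption, not from the paper).

    Maps a payment in [0, 1] to the probability that an expert predicts
    correctly. It is Lipschitz in the payment with constant 0.5.
    """
    clipped = min(1.0, max(0.0, payment))
    return 0.5 + 0.5 * clipped


def round_cost(payments, label, rng, payment_weight=0.1):
    """Simulate one round and return its cost.

    payments: one payment per expert, each in [0, 1].
    label: the true label, either -1 or +1.
    rng: a random.Random instance driving the experts' randomness.
    payment_weight: assumed trade-off between error and payments.
    """
    votes = []
    for payment in payments:
        # Each expert is correct with probability productivity(payment).
        is_correct = rng.random() < productivity(payment)
        votes.append(label if is_correct else -label)

    # Illustrative aggregation: unweighted majority vote, ties go to +1.
    prediction = 1 if sum(votes) >= 0 else -1
    error = 1.0 if prediction != label else 0.0

    # Cost: prediction error plus weighted sum of the upfront payments.
    return error + payment_weight * sum(payments)


if __name__ == "__main__":
    rng = random.Random(0)
    # Paying every expert the maximum makes each one always correct here,
    # so the cost reduces to the payment term alone.
    print(round_cost([1.0, 1.0, 1.0], label=1, rng=rng))
    # Paying nothing leaves each expert at chance level.
    print(round_cost([0.0, 0.0, 0.0], label=-1, rng=rng))
```

In this toy model the learner faces the trade-off described above: higher payments reduce the expected prediction error but increase the payment term, and the productivity function that governs this trade-off is unknown to the learner in the actual problem.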