We propose a new variant of online learning that we call "ambiguous online learning". In this setting, the learner is allowed to produce multiple predicted labels. Such an "ambiguous prediction" is considered correct when at least one of the labels is correct and none of the labels are "predictably wrong". The notion of "predictably wrong" comes from a hypothesis class in which hypotheses are themselves multi-valued: a predicted label is "predictably wrong" if it is not allowed by the (unknown) true hypothesis. This setting arises naturally in the context of multi-valued dynamical systems, recommendation algorithms and lossless compression. It is also strongly related to so-called "apple tasting". We show that in this setting there is a trichotomy of mistake bounds: up to logarithmic factors, any hypothesis class has an optimal mistake bound of either Theta(1), Theta(sqrt(N)) or N, where N is the number of rounds.
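To make the mistake criterion concrete, here is a minimal sketch of the counting rule in Python. It assumes, beyond what the abstract states, that a multi-valued hypothesis maps each instance to the set of labels it allows and that the revealed true label always lies in that set; the learner's feedback model and update rule are omitted, and all identifiers such as `count_mistakes` are illustrative rather than from the paper.

```python
# Minimal sketch of the mistake rule in "ambiguous online learning".
# Assumptions beyond the abstract: a multi-valued hypothesis maps each
# instance to the set of labels it allows, and the revealed true label is
# always allowed by the true hypothesis. All names are illustrative.
from typing import Callable, Hashable, Iterable, Set, Tuple

Instance = Hashable
Label = Hashable
Hypothesis = Callable[[Instance], Set[Label]]  # multi-valued hypothesis

def count_mistakes(
    rounds: Iterable[Tuple[Instance, Label]],
    predict: Callable[[Instance], Set[Label]],  # learner's ambiguous prediction
    true_hypothesis: Hypothesis,
) -> int:
    """Count mistakes: a prediction is correct iff it contains the true
    label and every predicted label is allowed by the true hypothesis."""
    mistakes = 0
    for x, y_true in rounds:
        prediction = predict(x)
        allowed = true_hypothesis(x)
        hit = y_true in prediction                     # some label is correct
        predictably_wrong = not prediction <= allowed  # some label disallowed
        if not hit or predictably_wrong:
            mistakes += 1
    return mistakes
```

For example, if the true hypothesis allows {0, 1} on some instance and the true label is 1, then predicting {1} or {0, 1} is correct, predicting {0} is a mistake because it misses the true label, and predicting {1, 2} is a mistake because the label 2 is predictably wrong.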