We study a variant of online multiclass classification where the learner predicts a single label but receives a \textit{set of labels} as feedback. In this model, the learner is penalized whenever its predicted label is not contained in the revealed set. We show that, unlike online multiclass learning with single-label feedback, deterministic and randomized online learnability are \textit{not equivalent} under set-valued feedback, even in the realizable setting. Accordingly, we give two new combinatorial dimensions, the Set Littlestone dimension and the Measure Shattering dimension, that tightly characterize deterministic and randomized online learnability, respectively, in the realizable setting. In addition, we show that the Measure Shattering dimension characterizes online learnability in the agnostic setting and tightly quantifies the minimax regret. Finally, we use our results to establish bounds on the minimax regret for three practical learning settings: online multilabel ranking, online multilabel classification, and real-valued prediction with interval-valued response.