In online binary classification under \textit{apple tasting} feedback, the learner observes the true label only if it predicts ``1''. We revisit this classical partial-feedback setting, first studied by \cite{helmbold2000apple}, and study online learnability from a combinatorial perspective. We show that the Littlestone dimension continues to provide a tight quantitative characterization of apple tasting in the agnostic setting, closing an open question posed by \cite{helmbold2000apple}. In addition, we give a new combinatorial parameter, called the Effective width, that tightly quantifies the minimax expected number of mistakes in the realizable setting. As a corollary, we use the Effective width to establish a \textit{trichotomy} for the minimax expected number of mistakes in the realizable setting. In particular, we show that in the realizable setting, the expected number of mistakes for any learner under apple tasting feedback can only be $\Theta(1)$, $\Theta(\sqrt{T})$, or $\Theta(T)$.