In online binary classification under \emph{apple tasting} feedback, the learner observes the true label only if it predicts ``1''. We revisit this classical partial-feedback setting, first studied by \cite{helmbold2000apple}, and study online learnability from a combinatorial perspective. We show that the Littlestone dimension continues to provide a tight quantitative characterization of apple tasting in the agnostic setting, closing an open question posed by \cite{helmbold2000apple}. In addition, we give a new combinatorial parameter, called the Effective width, that tightly quantifies the minimax expected number of mistakes in the realizable setting. As a corollary, we use the Effective width to establish a \emph{trichotomy} for the minimax expected number of mistakes in the realizable setting: under apple tasting feedback, it is $\Theta(1)$, $\Theta(\sqrt{T})$, or $\Theta(T)$. This is in contrast to the full-information realizable setting, where only $\Theta(1)$ and $\Theta(T)$ are possible.
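
For concreteness, the feedback protocol can be sketched as follows. The notation here ($x_t$, $y_t$, $\hat{y}_t$, and $\mathrm{M}(T)$) is our own illustrative choice and is not fixed by the text above. In each round $t = 1, \dots, T$, the learner receives an instance $x_t$ and predicts $\hat{y}_t \in \{0, 1\}$; the true label $y_t \in \{0, 1\}$ is revealed only when $\hat{y}_t = 1$. Mistakes are counted in every round, whether or not the label is observed, so the quantity of interest is
\[
  \mathrm{M}(T) \;=\; \mathbb{E}\!\left[\, \sum_{t=1}^{T} \mathbb{1}\{\hat{y}_t \neq y_t\} \right],
\]
where the expectation is over the learner's internal randomness.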