We study a new learning protocol, termed partial-feedback online learning, in which each instance admits a set of acceptable labels but the learner observes only one acceptable label per round. We highlight that, while the classical version space is a standard tool for analyzing online learnability, it does not directly extend to this setting. We address this obstacle by introducing a collection version space, which maintains sets of hypotheses rather than individual hypotheses. Using this tool, we obtain a tight characterization of learnability in the set-realizable regime. In particular, we define the Partial-Feedback Littlestone dimension (PFLdim) and the Partial-Feedback Measure Shattering dimension (PMSdim), and show that they tightly characterize the minimax regret for deterministic and randomized learners, respectively. We further identify a nested-inclusion condition under which deterministic and randomized learnability coincide, resolving an open question of Raman et al. (2024b). Finally, we show that beyond set realizability, the minimax regret can be linear even for a hypothesis space H with |H| = 2, highlighting a fundamental barrier outside the set-realizable regime.