We study a new learning protocol, termed partial-feedback online learning, in which each instance admits a set of acceptable labels, but the learner observes only one acceptable label per round. We highlight that, while the classical version space is widely used to analyze online learnability, it does not directly extend to this setting. We overcome this obstacle by introducing a collection version space, which maintains sets of hypotheses rather than individual hypotheses. Using this tool, we obtain a tight characterization of learnability in the set-realizable regime. In particular, we define the Partial-Feedback Littlestone dimension (PFLdim) and the Partial-Feedback Measure Shattering dimension (PMSdim), and show that they tightly characterize the minimax regret for deterministic and randomized learners, respectively. We further identify a nested inclusion condition under which deterministic and randomized learnability coincide, resolving an open question of Raman et al. (2024b). Finally, we show that beyond set realizability, the minimax regret can be linear even for a hypothesis space H with |H|=2, highlighting a fundamental barrier outside the set-realizable regime.