Consider a predictor, a learner, whose input is a stream of discrete items. At every time point, the predictor's task is probabilistic multiclass prediction: it predicts which item may occur next by outputting zero or more candidate items, each with a probability, after which the actual item is revealed and the predictor learns from this observation. To output probabilities, the predictor keeps track of the proportions of the items it has seen. The predictor has constant (limited) space, and we seek efficient prediction and update techniques: the stream is unbounded, the set of items is unknown to the predictor, and the number of distinct items can also grow without bound. Moreover, there is non-stationarity: the underlying frequencies of items may change substantially from time to time. For instance, new items may start appearing and a few currently frequent items may cease to occur. The predictor, being space-bounded, need only provide probabilities for those items with (currently) sufficiently high frequency, i.e., the salient items. This problem is motivated by the setting of prediction games, a self-supervised learning regime where concepts serve as both the predictors and the predictands, and the set of concepts grows over time, resulting in non-stationarities as new concepts are generated and used. We develop moving average techniques designed to respond to such non-stationarities in a timely manner, and explore their properties. One is a simple technique based on queuing of count snapshots, and another combines queuing with an extended version of sparse EMA. The latter combination supports predictand-specific dynamic learning rates. We find that this flexibility allows for more accurate and timely convergence.
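To make the setting concrete, the following is a minimal sketch of a plain, fixed-rate sparse EMA predictor. It is not the extended, dynamic-rate variant developed in this work, and it omits the queuing of count snapshots. The class name `SparseEMA` and the parameters `rate` and `min_prob` are hypothetical names chosen for illustration. On each observation every stored weight is decayed by `1 - rate`, the observed item gains `rate`, and items whose weight falls below `min_prob` are dropped, which keeps the number of stored items at no more than `1 / min_prob`.

```python
from typing import Dict, Hashable


class SparseEMA:
    """Fixed-rate sparse EMA over a stream of discrete items (illustrative only)."""

    def __init__(self, rate: float = 0.05, min_prob: float = 0.01) -> None:
        # Assumption: a newly seen item enters with weight `rate`, so `min_prob`
        # must not exceed `rate` or new items would be pruned immediately.
        if not 0.0 < min_prob <= rate < 1.0:
            raise ValueError("require 0 < min_prob <= rate < 1")
        self.rate = rate
        self.min_prob = min_prob
        self.probs: Dict[Hashable, float] = {}

    def predict(self) -> Dict[Hashable, float]:
        """Return the currently salient items with their estimated probabilities."""
        return dict(self.probs)

    def update(self, item: Hashable) -> None:
        """Learn from the revealed item: decay all weights, then boost the observed one."""
        decay = 1.0 - self.rate
        for key in self.probs:
            self.probs[key] *= decay
        self.probs[item] = self.probs.get(item, 0.0) + self.rate
        # Drop non-salient items. The weights never sum to more than 1, so at
        # most 1 / min_prob items can survive this filter.
        self.probs = {k: p for k, p in self.probs.items() if p >= self.min_prob}


# Usage: a stream whose only frequent item changes abruptly from "a" to "b".
ema = SparseEMA(rate=0.05, min_prob=0.01)
for observed in ["a"] * 200 + ["b"] * 200:
    candidates = ema.predict()  # predictions are made before the item is revealed
    ema.update(observed)
```

With a single fixed rate there is a trade-off: a larger `rate` adapts faster after a change but gives noisier estimates, while a smaller one is more stable but slower. This is the tension that predictand-specific dynamic learning rates are intended to ease.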