Tracking Changing Probabilities via Dynamic Learners

Consider a predictor, a learner, whose input is a stream of discrete items. The predictor's task, at every time point, is probabilistic multiclass prediction, i.e., to predict which item may occur next by outputting zero or more candidate items, each with a probability, after which the actual item is revealed and the predictor learns from this observation. To output probabilities, the predictor keeps track of the proportions of the items it has seen. The stream is unbounded while the predictor has limited space, so we seek efficient prediction and update techniques. The set of items is unknown to the predictor, and the number of distinct items can also grow without bound. Moreover, there is non-stationarity: the underlying frequencies of items may change substantially from time to time. For instance, new items may start appearing and a few recently frequent items may cease to occur. The predictor, being space-bounded, need only provide probabilities for those items with (currently) sufficiently high frequency, i.e., the salient items. This problem is motivated by the setting of prediction games, a self-supervised learning regime where concepts serve as both the predictors and the predictands, and the set of concepts grows over time, resulting in non-stationarities as new concepts are generated and used. We develop sparse multiclass moving average techniques designed to respond to such non-stationarities in a timely manner. One technique is based on the exponential moving average (EMA) and another is based on queuing a few count snapshots. We show that the combination, and in particular supporting dynamic predictand-specific learning rates, offers advantages in terms of faster change detection and convergence.
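To make the EMA-based idea concrete, the following is a minimal illustrative sketch, not the method developed in this work. It assumes a single fixed learning rate shared by all items and a simple pruning threshold; the names SparseEMA, rate, and min_prob are hypothetical and chosen only for this example. The queue-based technique and the predictand-specific dynamic rates are not shown.

```python
class SparseEMA:
    """Illustrative sparse EMA over a stream of discrete items (assumed design)."""

    def __init__(self, rate=0.1, min_prob=0.01):
        # rate: fixed learning rate (assumption; should be >= min_prob,
        #       otherwise a newly seen item is pruned immediately).
        # min_prob: items whose estimate falls below this are dropped,
        #           which keeps the map small (space-bounded).
        self.rate = rate
        self.min_prob = min_prob
        self.probs = {}  # item -> estimated probability (sparse map)

    def predict(self):
        # Zero or more candidate items, each with a probability.
        # The values sum to at most 1; the remainder is unallocated mass.
        return dict(self.probs)

    def update(self, item):
        # Decay every tracked estimate, then boost the observed item.
        decay = 1.0 - self.rate
        for key in self.probs:
            self.probs[key] *= decay
        self.probs[item] = self.probs.get(item, 0.0) + self.rate
        # Prune items that are no longer salient.
        self.probs = {k: p for k, p in self.probs.items() if p >= self.min_prob}


# Usage: the stream switches abruptly from item "a" to item "b".
model = SparseEMA(rate=0.1, min_prob=0.01)
for observed in ["a"] * 50 + ["b"] * 50:
    candidates = model.predict()  # predict first, then learn
    model.update(observed)
print(model.predict())
```

In this sketch a larger rate reacts faster to a change but yields noisier estimates, while a smaller rate is more stable but slower to adapt; that tension is what motivates rates that vary per predictand over time.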
