This paper proposes a new way of evaluating the accuracy and validity of probabilistic forecasts that change over time, such as an in-game win probability model or an election forecast. Each model to be evaluated is treated as a canonical Kelly bettor, and the models are pitted against each other in an iterative betting contest. The growth or decline of each model's bankroll serves as the evaluation metric. Market consensus probabilities and implied model credibilities can be updated in real time as each model updates, without waiting for the final outcome. A simulation model is used to show that this method is, in general, more accurate than the traditional average log-loss and Brier score methods at distinguishing a correct model from an incorrect one. The Kelly approach is also shown to have a direct mathematical and conceptual analogue in Bayesian inference, with bankroll serving as a proxy for Bayesian credibility.
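The link between bankroll and Bayesian credibility can be illustrated with a minimal sketch. It is not the paper's implementation, and it makes simplifying assumptions that the abstract does not state: each round is a binary event that resolves before the next round, the market price is the bankroll-weighted mean of the models' probabilities, and every probability lies strictly between 0 and 1. The function name `kelly_contest` and its parameters are hypothetical.

```python
def kelly_contest(forecasts, outcomes, bankrolls=None):
    """Run a simplified Kelly betting contest between forecasting models.

    forecasts[t][i] is model i's probability that event t occurs (assumed
    strictly between 0 and 1); outcomes[t] is 1 if it occurred, else 0.
    Returns the final bankroll of each model.
    """
    n = len(forecasts[0])
    # Assumption: equal starting bankrolls unless the caller supplies them.
    wealth = list(bankrolls) if bankrolls is not None else [1.0 / n] * n
    for probs, y in zip(forecasts, outcomes):
        # Probability each model assigned to what actually happened.
        likelihood = [p if y == 1 else 1.0 - p for p in probs]
        # Market-clearing price of the realized outcome: the
        # bankroll-weighted mean of the models' probabilities.
        total = sum(wealth)
        price = sum(w * l for w, l in zip(wealth, likelihood)) / total
        # A Kelly bettor with belief l facing price q multiplies its
        # bankroll by l / q when its side of the bet pays off.
        wealth = [w * l / price for w, l in zip(wealth, likelihood)]
    return wealth


# Two hypothetical models: one always says 0.8, the other always says 0.5.
final = kelly_contest([[0.8, 0.5]] * 3, [1, 1, 0])
print(final)
```

Under these assumptions the total bankroll is conserved in every round, and each model's share of it equals the Bayesian posterior probability of that model given the outcomes, starting from a prior equal to the initial bankroll shares.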