Cost-sensitive retraining via posterior learning debt

Deployed prediction systems are often retrained on fixed calendars, even when model staleness and retraining burden vary over time. This short communication formulates retraining for Bayesian prediction systems as a cost-sensitive predictive-regret decision. The central monitoring state is posterior learning debt, defined as the Kullback-Leibler divergence from a reference shadow posterior to the deployed frozen posterior. In the decision layer, a retraining cost is compared with the expected one-period predictive regret of waiting. A continuous-severity version retrains when calibrated expected regret exceeds the retraining cost, while the familiar two-state excess-loss rule is a special case. The empirical study is an exact-state proof of concept in a synthetic conjugate simulation with warm-started deployed and shadow normal-inverse-gamma posteriors, separate update, monitoring, and evaluation batches, lagged deployment actions, expanded baseline grids, and score-unit sensitivity. Under the primary 75th-percentile score-unit scaling, an age-adjusted debt-threshold policy improves on tuned calendar retraining in all 72 non-stable scenario cells and on tuned CUSUM in 58 of 72 cells, with mean relative objectives of 0.677 and 0.975, respectively. Debt-utility and hybrid-utility policies also improve strongly over tuned calendar retraining, but they do not dominate tuned CUSUM. Median and mean score-unit sensitivities show the same main calendar result, while the CUSUM comparison remains policy-dependent. The contribution is a transparent decision layer for deployed Bayesian prediction systems, not a universal replacement for drift detection.
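As a rough illustration of the decision layer described above, the following Python sketch compares a retraining cost with the expected one-period regret of waiting. It is not the paper's implementation. The function names, the discretized severity grid standing in for the continuous-severity version, and the reading of the two-state rule as "drift probability times excess loss" are all assumptions made for this example. How the regret values are calibrated from posterior learning debt is not specified in the abstract and is left to the caller.

```python
from typing import Sequence


def expected_regret(severity_probs: Sequence[float],
                    regrets: Sequence[float]) -> float:
    """Expected one-period predictive regret of waiting.

    Assumption: severity is discretized into a finite grid, with
    severity_probs[i] the probability of level i and regrets[i] the
    (already calibrated) regret incurred at that level.
    """
    if len(severity_probs) != len(regrets):
        raise ValueError("severity_probs and regrets must have equal length")
    if abs(sum(severity_probs) - 1.0) > 1e-9:
        raise ValueError("severity_probs must sum to 1")
    return sum(p * r for p, r in zip(severity_probs, regrets))


def should_retrain(regret: float, retrain_cost: float) -> bool:
    """Retrain when the expected regret of waiting exceeds the retraining cost."""
    return regret > retrain_cost


def two_state_should_retrain(p_drift: float,
                             excess_loss: float,
                             retrain_cost: float) -> bool:
    """Two-state special case (hypothetical form): no regret when stable,
    a fixed excess loss when drifted."""
    regret = expected_regret([1.0 - p_drift, p_drift], [0.0, excess_loss])
    return should_retrain(regret, retrain_cost)


if __name__ == "__main__":
    # Three illustrative severity levels with made-up probabilities and regrets.
    r = expected_regret([0.7, 0.2, 0.1], [0.0, 1.0, 4.0])
    print(should_retrain(r, retrain_cost=0.5))
    print(two_state_should_retrain(p_drift=0.2, excess_loss=2.0, retrain_cost=0.5))
```

In the two-state case the rule reduces to retraining when the drift probability exceeds the ratio of retraining cost to excess loss, which is one way to see why it is a special case of the expected-regret comparison.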
