Machine learning models deployed in nonstationary environments inevitably experience performance degradation due to data drift. While numerous drift-detection heuristics exist, most lack a dynamical interpretation and provide limited guidance on how retraining decisions should be balanced against operational cost. In this work, we propose an entropy-based retraining framework grounded in nonequilibrium statistical physics. Interpreting drift as probability flow governed by a Fokker–Planck equation, we quantify model-data mismatch using relative entropy and show that its time derivative admits an entropy-balance decomposition featuring a nonnegative entropy production term driven by probability currents. Guided by this theory, we implement an entropy-triggered retraining policy that applies an exponentially weighted moving-average (EWMA) control statistic to a streaming, kernel-density-based estimate of the Kullback–Leibler divergence. We evaluate this approach across multiple nonstationary data streams. In synthetic, financial, and web-traffic domains, entropy-based retraining achieves predictive performance comparable to frequent retraining while reducing retraining frequency by one to two orders of magnitude. However, in a challenging biomedical ECG setting, the entropy-based trigger underperforms the maximum-frequency baseline, highlighting the limitations of feature-space entropy monitoring under complex label-conditional drift.
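To make the entropy-balance claim concrete, the following is one standard form of the decomposition, sketched under simplifying assumptions not stated in the abstract: overdamped Fokker–Planck dynamics with current \(J_t\), and a reference (model) density \(q\) that is stationary for the same dynamics with vanishing current (detailed balance), so that \(b = D\,\nabla \ln q\). Under these assumptions,

\[
\partial_t p_t = -\nabla \cdot J_t, \qquad J_t = b\,p_t - D\,\nabla p_t = -D\,p_t\,\nabla \ln \frac{p_t}{q},
\]

and, using \(\int \partial_t p_t\,dx = 0\) and integration by parts with decaying boundary terms,

\[
\frac{d}{dt}\,D_{\mathrm{KL}}(p_t \,\|\, q)
= \int J_t \cdot \nabla \ln \frac{p_t}{q}\,dx
= -\int p_t \left(\nabla \ln \frac{p_t}{q}\right)^{\!\top} D \left(\nabla \ln \frac{p_t}{q}\right) dx \;\le\; 0,
\]

a single nonnegative dissipation term driven by the probability current. In the general case, where \(q\) is not stationary or the stationary current is nonzero, the derivative splits into a production term of this form plus an entropy-flow term, consistent with the entropy-balance decomposition described above.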
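The retraining policy itself can be sketched in a few lines. Below is a minimal, illustrative implementation assuming a sliding-window KDE for the current density, a Monte Carlo plug-in estimate of the KL divergence, and a control-chart-style EWMA limit. The class name EntropyTrigger and the parameter values (window, bandwidth, lam, L) are hypothetical choices for exposition, not the paper's tuned settings.

```python
# Sketch of an entropy-triggered retraining policy: an EWMA control statistic
# over a streaming KDE plug-in estimate of D_KL(p_t || q). Hypothetical
# parameter values; not the paper's tuned configuration.
import numpy as np
from sklearn.neighbors import KernelDensity

class EntropyTrigger:
    def __init__(self, ref_X, window=500, bandwidth=0.5, lam=0.1, L=3.0):
        # Reference (model) density q, fit on the model's training data.
        self.kde_q = KernelDensity(bandwidth=bandwidth).fit(ref_X)
        self.window = window
        self.bandwidth = bandwidth
        self.lam = lam           # EWMA smoothing factor
        self.L = L               # control-limit width in standard deviations
        self.buffer = []         # recent feature vectors
        self.z = 0.0             # EWMA statistic
        self.kl_history = []     # past KL estimates, used for the control limit

    def update(self, x):
        """Ingest one feature vector; return True if retraining is triggered."""
        self.buffer.append(np.asarray(x))
        self.buffer = self.buffer[-self.window:]
        if len(self.buffer) < self.window:
            return False
        X = np.stack(self.buffer)
        kde_p = KernelDensity(bandwidth=self.bandwidth).fit(X)  # current density p_t
        # Plug-in Monte Carlo estimate of D_KL(p_t || q) = E_p[log p - log q],
        # averaged over the samples in the current window.
        kl = float(np.mean(kde_p.score_samples(X) - self.kde_q.score_samples(X)))
        self.kl_history.append(kl)
        self.z = self.lam * kl + (1.0 - self.lam) * self.z  # EWMA update
        mu = np.mean(self.kl_history)
        sigma = np.std(self.kl_history) + 1e-12
        if self.z > mu + self.L * sigma:  # EWMA exceeds the control limit
            self.kde_q = kde_p            # re-anchor q to the post-drift window
            self.buffer, self.kl_history, self.z = [], [], 0.0
            return True
        return False
```

In use, update(x) would be called once per arriving feature vector; a True return signals that a retraining job should be launched, after which the reference density is re-anchored to the current window. Re-anchoring resets the divergence baseline so that the trigger monitors drift relative to the most recently retrained model rather than the original one.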