Temporal distribution shifts pose a key challenge for machine learning models trained and deployed in dynamically evolving environments. This paper introduces RIDER (RIsk minimization under Dynamically Evolving Regimes), which derives optimally weighted empirical risk minimization procedures under temporal distribution shifts. Our approach is theoretically grounded in the random distribution shift model, where random shifts arise as a superposition of numerous unpredictable changes in the data-generating process. We show that common weighting schemes, such as pooling all data, exponentially weighting data, and using only the most recent data, emerge naturally as special cases in our framework. Applied as a fine-tuning step on the Yearbook dataset, RIDER consistently improves out-of-sample predictive performance across a range of benchmark methods from Wild-Time. Moreover, we show that RIDER outperforms standard weighting strategies in two other real-world tasks: predicting stock market volatility and forecasting ride durations in NYC taxi data.
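To make the special cases concrete, the following is a minimal sketch of weighted empirical risk minimization over time periods, where uniform weights correspond to pooling all data, geometrically decaying weights to exponential weighting, and a point mass on the last period to using only the most recent data. The function names (`time_weights`, `weighted_erm_loss`), the decay value, and the toy losses are illustrative assumptions, not the RIDER weights derived in the paper.

```python
import numpy as np

def time_weights(T, scheme="exponential", decay=0.9):
    """Illustrative per-period weights for weighted ERM over T time periods.

    scheme="pool"        -> uniform weights (pool all data)
    scheme="exponential" -> weights decaying geometrically with age
    scheme="recent"      -> weight only the most recent period
    """
    if scheme == "pool":
        w = np.ones(T)
    elif scheme == "exponential":
        # Oldest period gets decay**(T-1), most recent period gets decay**0 = 1.
        w = decay ** np.arange(T - 1, -1, -1)
    elif scheme == "recent":
        w = np.zeros(T)
        w[-1] = 1.0
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return w / w.sum()

def weighted_erm_loss(losses_per_period, weights):
    """Weighted empirical risk: sum_t w_t * (average loss on period t's data)."""
    per_period_risk = np.array([np.mean(l) for l in losses_per_period])
    return float(np.dot(weights, per_period_risk))

# Toy example: four periods of per-example losses (made-up numbers).
losses = [np.array([0.9, 1.1]), np.array([0.7, 0.8]),
          np.array([0.5, 0.6]), np.array([0.3, 0.4])]
for scheme in ("pool", "exponential", "recent"):
    w = time_weights(len(losses), scheme=scheme)
    print(scheme, round(weighted_erm_loss(losses, w), 3))
```

In this sketch the weights are fixed a priori; RIDER instead derives the weights from the random distribution shift model rather than choosing a scheme by hand.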