In many forecasting settings, there is a specific interest in predicting the sign of an outcome variable correctly, in addition to its magnitude. For instance, when forecasting armed conflicts, positive and negative log-changes in monthly fatalities represent escalation and de-escalation, respectively, and have very different implications. In the ViEWS forecasting challenge, a prediction competition on state-based violence, a novel evaluation score called targeted absolute deviation with direction augmentation (TADDA) has therefore been suggested, which accounts for both the sign and the magnitude of log-changes. While the score has a straightforward intuitive motivation, the empirical results of the challenge show that a no-change model, which always predicts a log-change of zero, outperforms all submitted forecasting models under TADDA. We provide a statistical explanation for this phenomenon. Analyzing the properties of TADDA, we find that, in order to achieve good scores, forecasters often have an incentive to predict no or only modest log-changes. In particular, there is often an incentive to report conservative point predictions that are considerably closer to zero than the forecaster's actual predictive median or mean. In an empirical application, we demonstrate that a no-change model can be improved upon by tailoring predictions to the particularities of the TADDA score. We conclude by outlining some alternative scoring concepts.
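
The incentive to shrink point predictions towards zero can be illustrated with a small numerical sketch. The score used here, `toy_direction_score`, is a hypothetical simplification assumed purely for illustration (absolute error plus an extra penalty of |ŷ| when the predicted sign is wrong); it is not the exact TADDA definition used in the challenge. The outcome sample and the candidate grid are likewise made up.

```python
from statistics import median


def toy_direction_score(y_hat, y):
    """Absolute error plus an extra |y_hat| when the predicted sign is wrong.

    Hypothetical stand-in for a direction-augmented score,
    NOT the official TADDA formula.
    """
    sign_miss = y_hat * y < 0  # prediction and outcome have opposite signs
    return abs(y - y_hat) + (abs(y_hat) if sign_miss else 0.0)


def mean_score(y_hat, outcomes):
    """Average toy score of one point prediction over a sample of outcomes."""
    return sum(toy_direction_score(y_hat, y) for y in outcomes) / len(outcomes)


# Made-up sample of log-changes standing in for a predictive distribution.
outcomes = [-1.0, -0.5, 0.2, 0.4, 0.6, 0.8, 1.0]

# Candidate point predictions on a grid from -1.0 to 1.0 in steps of 0.1.
candidates = [i / 10 for i in range(-10, 11)]

# Point prediction with the lowest average toy score.
best = min(candidates, key=lambda c: mean_score(c, outcomes))
predictive_median = median(outcomes)

print("score-minimising prediction:", best)
print("predictive median:", predictive_median)
```

On this toy sample, the score-minimising prediction is 0.2, which lies closer to zero than the predictive median of 0.4, and it also scores better than the constant zero prediction.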