To form precipitation datasets that are both accurate and spatially dense, satellite and gauge data are often merged in the literature. However, uncertainty estimates for the merged data are rarely provided, although the importance of uncertainty quantification in predictive modelling is widely recognized. Furthermore, the benefits that machine learning can bring to providing such estimates have not been broadly realized or properly explored through benchmark experiments. The present study aims to fill this gap by conducting the first benchmark tests on the topic. On a large dataset comprising 15 years of monthly data spanning the contiguous United States, we extensively compared six learners that are, by construction, appropriate for predictive uncertainty quantification: quantile regression (QR), quantile regression forests (QRF), generalized random forests (GRF), gradient boosting machines (GBM), light gradient boosting machines (LightGBM) and quantile regression neural networks (QRNN). The comparison assessed the learners' ability to issue predictive quantiles at nine levels, which together provide a good approximation of the entire predictive probability distribution, and was based primarily on the quantile skill score and the continuous ranked probability skill score. Three types of predictor variables (satellite precipitation variables, distances between a point of interest and satellite grid points, and elevation at a point of interest) were used and were also compared with each other on the basis of feature importance, a concept from explainable machine learning. The results suggest that the learners rank, from best to worst for the task investigated, as follows: LightGBM, QRF, GRF, GBM, QRNN and QR...
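As an illustration of the scoring idea mentioned above, the following is a minimal sketch of the quantile (pinball) score and a skill score relative to a reference method. It is not the study's implementation: the function names, the toy numbers, the 0.9 quantile level and the skill-score convention `1 - score / reference_score` are all assumptions made for this example, and the abstract does not state which nine levels or which reference method were used.

```python
def pinball_loss(y, q, tau):
    """Quantile (pinball) loss of a predicted quantile q at level tau for observation y."""
    diff = y - q
    # Under-prediction is weighted by tau, over-prediction by (1 - tau).
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def mean_quantile_score(observations, predictions, tau):
    """Average pinball loss over paired observations and predicted quantiles."""
    losses = [pinball_loss(y, q, tau) for y, q in zip(observations, predictions)]
    return sum(losses) / len(losses)


def skill_score(model_score, reference_score):
    """Relative improvement over a reference (assumed convention: 1 is perfect, 0 is no gain)."""
    return 1.0 - model_score / reference_score


# Hypothetical toy data: monthly gauge totals and two sets of predicted 0.9-quantiles.
observed = [10.0, 20.0, 30.0]
model_q90 = [12.0, 22.0, 28.0]      # hypothetical learner output
reference_q90 = [20.0, 20.0, 20.0]  # hypothetical reference, e.g. a constant quantile

model_score = mean_quantile_score(observed, model_q90, tau=0.9)
reference_score = mean_quantile_score(observed, reference_q90, tau=0.9)
skill = skill_score(model_score, reference_score)
```

Lower quantile scores are better, so a positive skill value indicates that the hypothetical learner improves on the reference at that level. Averaging such scores over several levels gives a discrete approximation related to the continuous ranked probability score.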