Quantile regression is a fundamental problem in statistical learning, motivated by the need to quantify uncertainty in predictions or to model a diverse population without being overly reductive. For instance, epidemiological forecasts, cost estimates, and revenue predictions all benefit from an accurate quantification of the range of possible values. Accordingly, many models for this problem have been developed over years of research in statistics, machine learning, and related fields. Rather than proposing yet another algorithm for quantile regression, we adopt a meta viewpoint: we investigate methods for aggregating any number of conditional quantile models in order to improve accuracy and robustness. We consider weighted ensembles in which the weights may vary not only over individual models, but also over quantile levels and feature values. All of the models we consider in this paper can be fit using modern deep learning toolkits, and hence are widely accessible (from an implementation point of view) and scalable. To improve the accuracy of the predicted quantiles (or equivalently, prediction intervals), we develop tools for ensuring that quantiles remain monotonically ordered, and we apply conformal calibration methods. These can be used without any modification of the original library of base models. We also review some basic theory surrounding quantile aggregation and related scoring rules, and contribute a few new results to this literature (for example, the fact that sorting or isotonic regression applied as a post-processing step can only improve the weighted interval score). Finally, we provide an extensive suite of empirical comparisons across 34 data sets from two different benchmark repositories.
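The claim that sorting crossed quantiles cannot hurt the score can be illustrated with a small numerical sketch. It is not the paper's implementation: it uses the average pinball (quantile) loss over a set of levels as a stand-in for the weighted interval score, and the quantile levels, predictions, outcome, and function names below are all hypothetical values chosen for illustration.

```python
import numpy as np

def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of prediction q at level tau for outcome y."""
    diff = y - q
    return np.maximum(tau * diff, (tau - 1.0) * diff)

def mean_pinball(y, preds, taus):
    """Average pinball loss over levels; preds[i] is the prediction at taus[i]."""
    return float(np.mean(pinball_loss(y, preds, taus)))

# Hypothetical quantile levels and predictions (assumed for illustration).
taus = np.array([0.1, 0.5, 0.9])
crossed = np.array([2.0, 1.0, 3.0])   # violates monotonicity: q(0.1) > q(0.5)
sorted_preds = np.sort(crossed)       # post hoc sorting restores the ordering

y = 1.5                               # hypothetical observed outcome
before = mean_pinball(y, crossed, taus)
after = mean_pinball(y, sorted_preds, taus)
print(f"crossed: {before:.4f}, sorted: {after:.4f}")
```

In this toy case, sorting only reassigns the same predicted values to quantile levels in increasing order, and the averaged loss does not increase for the chosen outcome.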