Quantifying prediction uncertainty has become a key research topic in recent years across scientific and business problems. In the insurance industry \cite{parodi2023pricing}, assessing the range of possible claim costs for individual drivers improves premium pricing accuracy. It also enables insurers to manage risk more effectively by accounting for uncertainty in accident likelihood and severity. When covariates are available, a variety of regression-type models are used to model insurance claims, ranging from relatively simple generalized linear models (GLMs) to regularized GLMs to gradient boosting models (GBMs). Conformal predictive inference has emerged as a popular distribution-free approach for quantifying predictive uncertainty under the relatively weak assumption of exchangeability, and it has been well studied in the classical linear regression setting. In this work, we propose new non-conformity measures for GLMs and GBMs with GLM-type loss. Using regularized Tweedie GLM regression and LightGBM with Tweedie loss, we demonstrate conformal prediction performance with these non-conformity measures on insurance claims data. Our simulation results favor locally weighted Pearson residuals for LightGBM over the other methods considered, as the resulting intervals maintained the nominal coverage with the smallest average width.
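As a rough illustration of the kind of non-conformity measure described above, the following sketch shows one standard way a Pearson-residual score could be used in split conformal prediction. It is not necessarily the exact definition used in this work. It assumes a Tweedie model with power variance function $V(\mu) = \mu^{p}$, a fitted mean $\hat{\mu}(x)$, a held-out calibration set $\{(x_i, y_i)\}_{i=1}^{n}$, and a miscoverage level $\alpha$; the symbols $p$, $\hat{\mu}$, $n$, $\alpha$, $R_i$, and $\hat{q}$ are notation introduced here for illustration only.

\[
R_i = \frac{|y_i - \hat{\mu}(x_i)|}{\hat{\mu}(x_i)^{p/2}}, \qquad i = 1, \dots, n,
\]
\[
\hat{q} = \text{the } \lceil (n+1)(1-\alpha) \rceil\text{-th smallest value among } R_1, \dots, R_n,
\]
\[
\hat{C}(x) = \left[\, \max\{0,\ \hat{\mu}(x) - \hat{q}\,\hat{\mu}(x)^{p/2}\},\ \ \hat{\mu}(x) + \hat{q}\,\hat{\mu}(x)^{p/2} \,\right].
\]

Provided $\lceil (n+1)(1-\alpha) \rceil \le n$ and the calibration and test points are exchangeable, an interval of this form covers a new response with probability at least $1-\alpha$. Truncating the lower endpoint at zero does not reduce coverage because claim costs are non-negative. The width scales with $\hat{\mu}(x)^{p/2}$, so it adapts to the predicted mean. A locally weighted variant would, under the same assumptions, replace this scale with a separately estimated measure of residual spread at $x$.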