In a context of ever-increasing competition and heightened regulatory pressure, accuracy, actuarial precision, and the transparency and understandability of the tariff are key issues in non-life insurance. The generalized linear models (GLMs) traditionally used yield a multiplicative tariff that favors interpretability. With the rapid development of machine learning and deep learning, actuaries and the wider insurance industry have widely adopted these techniques. However, they need to be paired with interpretability techniques. In this paper, we introduce the Explainable Boosting Machine (EBM), a model that combines intrinsic interpretability with high predictive performance. This approach is described as a glass-box model and relies on a Generalized Additive Model (GAM) and a cyclic gradient boosting algorithm. It accounts for univariate effects and pairwise interaction effects between features and naturally provides explanations for them. We apply this approach to car insurance frequency and severity data and extensively compare its performance with that of classical competitors: a GLM, a GAM, a CART model, and an Extreme Gradient Boosting (XGB) algorithm. Finally, we examine the interpretability of these models to capture the main determinants of claim costs.
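As a rough illustration of the idea of fitting an additive model by cyclic gradient boosting, the following is a minimal sketch and not the implementation used in the paper or in any EBM library. It makes several simplifying assumptions: squared-error loss (a frequency or severity model would more typically use a Poisson or gamma deviance), per-bin mean residuals in place of shallow trees, and no pairwise interaction terms. The names `fit_cyclic_additive` and `predict` and all parameter defaults are hypothetical.

```python
import bisect


def fit_cyclic_additive(X, y, n_bins=8, n_rounds=100, learning_rate=0.2):
    """Toy cyclic boosting of y ~ intercept + sum_j f_j(x_j), squared-error loss.

    X is a list of rows (lists of floats), y a list of floats.
    Each f_j is piecewise constant on equal-frequency bins of feature j.
    """
    n, p = len(X), len(X[0])
    edges, bins = [], []
    for j in range(p):
        col = sorted(row[j] for row in X)
        # Interior cut points at approximately equal-frequency quantiles.
        cuts = [col[(k * n) // n_bins] for k in range(1, n_bins)]
        edges.append(cuts)
        # Bin index (0 .. n_bins - 1) of every observation for feature j.
        bins.append([bisect.bisect_right(cuts, row[j]) for row in X])

    intercept = sum(y) / n
    shape = [[0.0] * n_bins for _ in range(p)]  # shape[j][b] = f_j on bin b
    pred = [intercept] * n

    for _ in range(n_rounds):
        for j in range(p):  # visit one feature at a time, in a fixed cycle
            sums = [0.0] * n_bins
            counts = [0] * n_bins
            for i in range(n):
                b = bins[j][i]
                sums[b] += y[i] - pred[i]  # residual = negative gradient
                counts[b] += 1
            # Small step toward the mean residual of each bin.
            step = [
                learning_rate * sums[b] / counts[b] if counts[b] else 0.0
                for b in range(n_bins)
            ]
            for b in range(n_bins):
                shape[j][b] += step[b]
            for i in range(n):
                pred[i] += step[bins[j][i]]

    return intercept, edges, shape


def predict(model, X):
    """Sum the intercept and the per-feature contributions for each row."""
    intercept, edges, shape = model
    out = []
    for row in X:
        total = intercept
        for j, value in enumerate(row):
            total += shape[j][bisect.bisect_right(edges[j], value)]
        out.append(total)
    return out
```

In this sketch the fitted `shape[j]` is itself the explanation for feature `j`: plotting it against the bin edges gives the feature's contribution to the prediction, which is the sense in which such a model is a glass box.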