Accuracy and interpretability are essential qualities of a (non-life) insurance pricing model, since together they ensure fair and transparent premiums that reflect the risk of each policy-holder. In recent years, classification and regression trees (CARTs) and their ensembles have gained popularity in the actuarial literature because they offer good predictive performance and are relatively easy to interpret. In this paper, we introduce Bayesian CART models for insurance pricing, with a particular focus on claims frequency modelling. In addition to the Poisson and negative binomial (NB) distributions commonly used for claims frequency, we implement Bayesian CART for the zero-inflated Poisson (ZIP) distribution to address the difficulty arising from imbalanced insurance claims data. To this end, we introduce a general MCMC algorithm that uses data augmentation methods for posterior tree exploration. We also introduce the deviance information criterion (DIC) for tree model selection. The proposed models are able to identify trees that better classify policy-holders into risk groups. Simulations and real insurance data are discussed to illustrate the applicability of these models.
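To make the two less familiar ingredients concrete, the following is a minimal sketch, not the paper's implementation, of a ZIP probability mass function and a DIC computed in the standard form (deviance at the posterior mean plus twice the effective number of parameters). It treats a single terminal node with one claim rate, ignores exposure, and takes the deviance as minus twice the log-likelihood. The function names `zip_pmf`, `zip_deviance` and `dic`, and the `(lam, pi)` parameterisation, are assumptions made for illustration; the DIC used for trees in the paper may be defined differently.

```python
import math


def zip_pmf(k, lam, pi):
    """P(N = k) under a zero-inflated Poisson.

    With probability pi the count is a structural zero; otherwise it is
    Poisson(lam). Assumed parameterisation, for illustration only.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1.0 - pi) * poisson


def zip_deviance(counts, lam, pi):
    """Deviance taken as -2 * log-likelihood of the observed counts."""
    return -2.0 * sum(math.log(zip_pmf(k, lam, pi)) for k in counts)


def dic(counts, draws):
    """Standard-form DIC from posterior draws [(lam, pi), ...].

    p_d = mean deviance - deviance at the posterior mean,
    DIC = deviance at the posterior mean + 2 * p_d.
    """
    n = len(draws)
    mean_dev = sum(zip_deviance(counts, lam, pi) for lam, pi in draws) / n
    lam_bar = sum(lam for lam, _ in draws) / n
    pi_bar = sum(pi for _, pi in draws) / n
    dev_at_mean = zip_deviance(counts, lam_bar, pi_bar)
    p_d = mean_dev - dev_at_mean
    return dev_at_mean + 2.0 * p_d
```

Setting `pi = 0` recovers the ordinary Poisson case, which is one way to see how the ZIP model absorbs the excess of zero-claim policies typical of imbalanced claims data.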