Accuracy and interpretability are essential qualities of a (non-life) insurance pricing model: together they ensure fair and transparent premiums that reflect each policyholder's risk. In recent years, classification and regression trees (CARTs) and their ensembles have gained popularity in the actuarial literature, since they offer good predictive performance and are relatively easy to interpret. In this paper, we introduce Bayesian CART models for insurance pricing, with a particular focus on claims frequency modelling. In addition to the Poisson and negative binomial (NB) distributions commonly used for claims frequency, we implement Bayesian CART for the zero-inflated Poisson (ZIP) distribution to address the difficulty arising from imbalanced insurance claims data. To this end, we introduce a general MCMC algorithm that uses data augmentation methods for posterior tree exploration. We also introduce the deviance information criterion (DIC) for tree model selection. The proposed models can identify trees that better classify policyholders into risk groups. Simulation studies and a real insurance data set are discussed to illustrate the applicability of these models.
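To make concrete why a ZIP distribution suits claims data dominated by zero counts, the following minimal sketch compares the probability of zero claims under a Poisson and a ZIP model. It assumes the standard textbook parameterisation of the ZIP distribution (a point mass at zero with weight `pi`, mixed with a Poisson of rate `lam`); the function names and the parameter values are illustrative only and are not taken from the paper, whose parameterisation may differ (for instance by including exposure).

```python
import math


def poisson_pmf(k, lam):
    """Poisson probability of observing k claims with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson probability of observing k claims.

    Assumed standard form: with probability pi the count is a structural
    zero; otherwise it is drawn from a Poisson distribution with rate lam.
    """
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


# Hypothetical values chosen for illustration only.
lam, pi = 0.1, 0.3
p0_poisson = poisson_pmf(0, lam)
p0_zip = zip_pmf(0, lam, pi)
print(f"P(N=0) under Poisson: {p0_poisson:.4f}")
print(f"P(N=0) under ZIP:     {p0_zip:.4f}")
```

For any `pi` between 0 and 1 and any positive `lam`, the ZIP model assigns more mass to zero than the Poisson model with the same rate, which is the feature that makes it a candidate for data with excess zeros.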