Providing personalized recommendations for insurance products is particularly challenging because of the distinctive features of the insurance domain. First, unlike more traditional domains such as retail and movies, large amounts of user feedback are not available and the item catalog is smaller. Second, because the products are more complex, the majority of users still prefer to complete their purchases over the phone rather than online. We present several recommender models that address this data scarcity. We use recurrent neural networks with three different types of loss functions and architectures (cross-entropy, censored Weibull, attention). Our models cope with data scarcity by learning from multiple sessions and different types of user actions. Moreover, unlike previous session-based models, our models learn to predict a target action that does not happen within the session. Our models outperform state-of-the-art baselines on a real-world insurance dataset with approximately 44K users, 16 items, 54K purchases and 117K sessions. Combining our models with demographic data further improves performance. Analysis shows that considering multiple sessions and considering several types of actions are both beneficial, and that our models are not unfair with respect to age, gender and income.