Traditional insurance pricing relies on risk-based principles that ensure actuarial fairness and solvency but do not explicitly account for policyholders' price sensitivity. We formulate insurance pricing as a decision-making problem and study it using tools from off-policy evaluation and stochastic control. We propose a kernelized inverse propensity score estimator that exploits local structure in the action space and reduces variance relative to the classical inverse propensity score estimator. Building on these value estimates, we investigate policy optimization and present two practical approaches for computing optimal pricing rules: an interpretable data-shared Lasso formulation and a flexible policy parameterization based on neural networks. Using a controlled synthetic travel insurance environment, we empirically confirm the theoretical results and show that neural networks outperform existing techniques for policy optimization.
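To make the idea of kernel-smoothed inverse propensity weighting concrete, the following is a minimal sketch and not the estimator defined in the paper. It assumes a continuous price action, a deterministic target pricing rule, a known logging density, and a Gaussian kernel with a fixed bandwidth; the names `kernel_ips_value`, `logging_density`, `target_policy`, and the toy demand model are all hypothetical and chosen for illustration only.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_ips_value(contexts, actions, rewards, logging_density,
                     target_policy, bandwidth):
    """Kernel-smoothed IPS estimate of a deterministic policy's value.

    Each logged reward is weighted by K((pi(x) - a) / h) / (h * mu(a | x)),
    so logged prices near the target price contribute to the estimate,
    rather than only exact matches as in classical IPS.
    """
    total = 0.0
    for x, a, r in zip(contexts, actions, rewards):
        u = (target_policy(x) - a) / bandwidth
        weight = gaussian_kernel(u) / (bandwidth * logging_density(a, x))
        total += weight * r
    return total / len(rewards)


# Toy logged data (hypothetical): prices drawn uniformly on [0, 1),
# purchase probability 1 - price, reward = price if purchased, else 0.
random.seed(0)
n = 20000
contexts = [None] * n  # contexts are unused in this toy setting
actions = [random.random() for _ in range(n)]
rewards = [a if random.random() < 1.0 - a else 0.0 for a in actions]

# Target rule: always quote price 0.5. In this toy model its true expected
# revenue is 0.5 * (1 - 0.5) = 0.25.
estimate = kernel_ips_value(
    contexts,
    actions,
    rewards,
    logging_density=lambda a, x: 1.0,  # uniform density on [0, 1)
    target_policy=lambda x: 0.5,
    bandwidth=0.1,
)
print(f"kernel IPS estimate: {estimate:.4f}")
```

The bandwidth controls a bias-variance trade-off: a larger value borrows information from more distant logged prices and lowers variance, at the cost of smoothing bias. With a continuous logging distribution, an exact-match indicator would almost never fire, which is one reason smoothing over neighbouring actions is attractive in this setting.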