When an insurance firm is faced with a new customer, many factors contribute to its decision about what offer to make. In addition to the expected cost of providing the insurance, the firm must consider the other offers likely to be made to the customer and how sensitive the customer is to differences in price. Moreover, firms often target a specific portfolio of customers, which may depend on, e.g., age, location, and occupation. Given such a target portfolio, a firm may choose to modulate an individual customer's offer based on whether it wants that customer in its portfolio. We term the problem of modulating offers to achieve a desired target portfolio the portfolio pursuit problem. Having formulated the portfolio pursuit problem as a sequential decision-making problem, we devise a novel reinforcement learning algorithm for its solution. We test our method on a complex synthetic market environment and demonstrate that it outperforms a baseline method that mimics current industry approaches to portfolio pursuit.