In lending, where prices are specific to both customers and products, a well-functioning personalized pricing policy is essential to running the business effectively. Typically, such a policy must be derived from observational data, which introduces several challenges. While the problem of ``endogeneity'' is prominently studied in the established pricing literature, the problem of selection bias (or, more precisely, bid selection bias) is not. We take a step towards understanding the effects of selection bias by posing pricing as a problem of causal inference. Specifically, we treat a customer's reaction to price as a treatment effect. In our experiments, we simulate varying levels of selection bias on a semi-synthetic dataset of mortgage loan applications in Belgium. We investigate the potential of parametric and nonparametric methods for the identification of individual bid-response functions. Our results illustrate how conventional methods such as logistic regression and neural networks are adversely affected by selection bias. In contrast, we implement state-of-the-art methods from causal machine learning and show that they can overcome selection bias in pricing data.
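To make the notion of ``varying levels of selection bias'' concrete, the following is a minimal sketch and not the simulation procedure used in the experiments. It assumes a single standardized customer covariate, a Gaussian historical pricing policy whose mean shifts with that covariate, and a logistic ground-truth bid-response. The function names, the parameter `bias_strength`, and all coefficients are hypothetical choices made for illustration.

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a real score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_bids(n, bias_strength, seed=0):
    """Hypothetical generator of (covariate, price, accepted) records.

    bias_strength = 0 mimics randomized pricing; larger values make the
    historical price depend more strongly on the observed covariate.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # observed customer covariate (assumed standardized)
        # Assumed historical pricing policy: the mean price shifts with x.
        price = 2.0 + bias_strength * x + rng.gauss(0.0, 0.5)
        # Assumed ground-truth bid-response: acceptance falls as price rises.
        p_accept = sigmoid(1.0 + 0.8 * x - 1.2 * (price - 2.0))
        accepted = 1 if rng.random() < p_accept else 0
        rows.append((x, price, accepted))
    return rows


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / math.sqrt(var_x * var_y)


def covariate_price_correlation(rows):
    """Correlation between the covariate and the assigned price."""
    return pearson([r[0] for r in rows], [r[1] for r in rows])


if __name__ == "__main__":
    for strength in (0.0, 0.5, 2.0):
        rows = simulate_bids(5000, strength)
        corr = covariate_price_correlation(rows)
        print(f"bias_strength={strength}: corr(x, price)={corr:.2f}")
```

In this sketch, the correlation between the covariate and the assigned price serves as a crude proxy for the level of selection bias: as it grows, fewer customers of a given type are observed across the full price range, so a bid-response model must extrapolate more heavily for them.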