Fairness has become an important concern in insurance pricing as insurers increasingly rely on machine learning models to predict expected losses. At the same time, regulatory and privacy constraints often restrict insurers' ability to access or use sensitive attributes such as gender or race. Recent actuarial research addresses fairness in this context through the concept of the discrimination-free premium, which removes both the direct and indirect effects of sensitive attributes while preserving actuarial consistency. However, implementing this approach typically requires access to the sensitive attributes themselves, which may not be available in practice. This paper studies the estimation of discrimination-free insurance premiums when sensitive attributes are observed only in privatized or noise-perturbed form. We consider a multi-party data setting in which insurers observe non-sensitive attributes and outcomes, while a trusted third party holds privatized sensitive attributes generated through a privacy mechanism. Within this framework, we develop statistical methods for estimating discrimination-free premiums using only the privatized attributes. We study two settings of practical relevance: when the privacy mechanism is known and when its noise level is unknown. For both cases, we establish theoretical guarantees for the proposed estimators. Numerical experiments and empirical applications demonstrate that the proposed approach enables fair insurance pricing while respecting privacy and regulatory constraints.
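To make the setting concrete, the following is a minimal sketch of how a discrimination-free premium might be estimated from privatized attributes. It is not the estimator proposed in this paper. It assumes the definition commonly used in the discrimination-free pricing literature, h*(x) = sum_d mu(x, d) P(D = d), where mu(x, d) = E[Y | X = x, D = d]. It further assumes a binary sensitive attribute D, a discrete non-sensitive attribute X, and a binary randomized-response mechanism with known retention probability that is independent of (X, Y) given D. All function and parameter names (`randomized_response`, `discrimination_free_premium`, `simulate`, `keep_prob`) are hypothetical, and the simulated loss model is purely illustrative.

```python
import numpy as np


def randomized_response(d, keep_prob, rng):
    """Report the true binary attribute with probability keep_prob, else flip it."""
    flip = rng.random(d.shape[0]) > keep_prob
    return np.where(flip, 1 - d, d)


def simulate(n, keep_prob, seed=0):
    """Illustrative data: X correlates with D, so X acts as a proxy for D."""
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=n)
    d = (rng.random(n) < np.where(x == 1, 0.7, 0.3)).astype(int)
    # Assumed loss model: mu(x, d) = 100 + 50 x + 40 d, plus Gaussian noise.
    y = 100 + 50 * x + 40 * d + rng.normal(0, 10, size=n)
    s = randomized_response(d, keep_prob, rng)  # only s is observed, not d
    return x, s, y


def discrimination_free_premium(x, s, y, keep_prob):
    """Moment-based estimate of h*(x) = sum_d mu(x, d) P(D = d) using only s.

    Requires keep_prob != 0.5 so that the mechanism matrix is invertible.
    """
    # T[d, s] = P(S = s | D = d) for the assumed randomized-response mechanism.
    T = np.array([[keep_prob, 1 - keep_prob],
                  [1 - keep_prob, keep_prob]])

    # P(S = s) = sum_d P(D = d) T[d, s], so solve T^T p_d = p_s for p_d.
    p_s = np.array([np.mean(s == k) for k in (0, 1)])
    p_d = np.linalg.solve(T.T, p_s)

    premiums = {}
    for xv in np.unique(x):
        cell = x == xv
        # Observable moments within the cell X = xv.
        q_s = np.array([np.mean(s[cell] == k) for k in (0, 1)])
        r_s = np.array([np.mean(y[cell] * (s[cell] == k)) for k in (0, 1)])
        # Undo the privacy noise: recover P(D = d | x) and E[Y 1{D = d} | x].
        q_d = np.linalg.solve(T.T, q_s)
        r_d = np.linalg.solve(T.T, r_s)
        mu = r_d / q_d  # mu(x, d) = E[Y 1{D = d} | x] / P(D = d | x)
        # Average over the marginal of D rather than the conditional given x.
        premiums[int(xv)] = float(mu @ p_d)
    return premiums
```

Under the assumed loss model, P(D = 1) = 0.5, so the target premiums are 120 for X = 0 and 170 for X = 1, whereas the conditional means E[Y | X] that ignore D are 112 and 178. Calling `discrimination_free_premium(*simulate(200_000, 0.8), keep_prob=0.8)` should return values close to the targets, up to sampling error. The correction relies on inverting the mechanism matrix, so its variance grows as `keep_prob` approaches 0.5; the case of an unknown noise level is not covered by this sketch.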