Using Open Banking as an example, this research article analyses and demonstrates the hidden fairness implications of combining seemingly neutral data with powerful technology such as machine learning (ML). Open Banking has ignited a revolution in financial services, opening new opportunities for customer acquisition, management, retention, and risk assessment. However, the granularity of transaction data also holds potential for harm: unnoticed proxies for sensitive and prohibited characteristics may lead to indirect discrimination. Against this backdrop, we investigate the dimensions of financial vulnerability (FV), a global concern resulting from COVID-19 and rising inflation. Specifically, we seek to understand, through the lens of fair interpretation, the behavioral elements leading up to FV and its impact on at-risk, disadvantaged groups. Using a unique dataset from a UK FinTech lender, we demonstrate the power of fine-grained transaction data while cautioning that it must be used safely. We compare three ML classifiers in predicting the likelihood of FV, and we use clustering to identify groups exhibiting different magnitudes and forms of FV, highlighting the effects of feature combination. Our results indicate that engineered features of financial behavior can be predictive of omitted personal information, particularly sensitive or protected characteristics, shedding light on the hidden dangers of Open Banking data. We discuss the implications and conclude that fairness via unawareness is ineffective in this new technological environment.