Vertical federated learning (VFL), in which institutions collaboratively train machine learning models by combining their local feature or label information, has been applied with great success to financial risk management (FRM). Rapid advances in graph representation learning (GRL) have opened up new opportunities for FRM applications under FL by making efficient use of the graph-structured data generated by the underlying transaction networks. However, transaction information is often considered highly sensitive. To prevent data leakage during training, it is critical to develop FL protocols with formal privacy guarantees. In this paper, we present VESPER, an end-to-end GRL framework for the VFL setting. VESPER is built upon a general privatization scheme termed perturbed message passing (PMP), which allows the privatization of many popular graph neural architectures. Based on PMP, we discuss the strengths and weaknesses of specific design choices in concrete graph neural architectures and provide solutions and improvements for both dense and sparse graphs. Extensive empirical evaluations on both public datasets and an industry dataset demonstrate that VESPER is capable of training high-performance GNN models over both sparse and dense graphs under reasonable privacy budgets.
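The abstract does not spell out how PMP perturbs messages, so the following is only a generic illustration of the idea of noisy message passing, not VESPER's actual protocol. It assumes sum aggregation, L2 clipping of node features to bound each node's contribution, and independent Gaussian noise on every aggregated coordinate. The names `clip_to_norm`, `perturbed_sum_aggregate`, `max_norm`, and `noise_std` are hypothetical, and `noise_std` is not calibrated to any particular privacy budget here.

```python
import math
import random


def clip_to_norm(vec, max_norm):
    """Scale vec so that its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= max_norm or norm == 0.0:
        return list(vec)
    scale = max_norm / norm
    return [x * scale for x in vec]


def perturbed_sum_aggregate(features, neighbors, max_norm, noise_std, rng):
    """One round of sum aggregation with Gaussian noise on each output.

    features:  dict node -> list of floats (all of the same length)
    neighbors: dict node -> list of neighbor nodes
    Assumption: clipping plus Gaussian noise stands in for whatever
    perturbation the paper's PMP scheme actually uses.
    """
    dim = len(next(iter(features.values())))
    # Clipping bounds how much any single node can change an aggregate.
    clipped = {v: clip_to_norm(h, max_norm) for v, h in features.items()}
    out = {}
    for v, nbrs in neighbors.items():
        agg = [0.0] * dim
        for u in nbrs:
            for i, x in enumerate(clipped[u]):
                agg[i] += x
        # Perturb the aggregated message before it leaves the data holder.
        out[v] = [a + rng.gauss(0.0, noise_std) for a in agg]
    return out


# Toy graph: node 0 is connected to nodes 1 and 2.
features = {0: [3.0, 4.0], 1: [0.3, 0.4], 2: [1.0, 0.0]}
neighbors = {0: [1, 2], 1: [0], 2: [0]}
noisy = perturbed_sum_aggregate(
    features, neighbors, max_norm=1.0, noise_std=0.5, rng=random.Random(0)
)
```

In a sketch like this, larger `noise_std` values would correspond to stronger perturbation and lower utility; how the noise scale maps to a formal privacy guarantee depends on the analysis in the paper and is not reproduced here.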