Fraud-related financial losses continue to rise, while regulatory, privacy, and data-sovereignty constraints increasingly limit the feasibility of centralized fraud detection systems. Federated Learning (FL) has emerged as a promising paradigm for enabling collaborative model training across institutions without sharing raw transaction data. Yet its practical effectiveness under realistic, non-IID financial data distributions remains insufficiently validated. In this work, we present a multi-institution, industry-oriented proof-of-concept study evaluating federated anomaly detection for payment transactions using the NVIDIA FLARE framework. We simulate a realistic federation of heterogeneous financial institutions, each observing distinct fraud typologies and operating under strict data isolation. Using a deep neural network trained via federated averaging (FedAvg), we demonstrate that federated models achieve a mean F1-score of 0.903, substantially outperforming locally trained models (0.643) and closely approaching centralized training performance (0.925), while preserving full data sovereignty. We further analyze convergence behavior, showing that strong performance is achieved within 10 federated communication rounds, highlighting the operational viability of FL in latency- and cost-sensitive financial environments. To support deployment in regulated settings, we evaluate model interpretability using Shapley-based feature attribution and confirm that federated models rely on semantically coherent, domain-relevant decision signals. Finally, we incorporate sample-level differential privacy via DP-SGD and demonstrate favorable privacy-utility trade-offs...
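As a rough illustration of the federated averaging (FedAvg) step named in the abstract, the sketch below combines per-institution model parameters into a global model by weighting each client by its local sample count. It is written in plain NumPy and is not the NVIDIA FLARE API or the authors' implementation; the function name `fedavg`, the three toy clients, and their sample counts are all hypothetical and chosen only for the example.

```python
import numpy as np


def fedavg(client_weights, client_sizes):
    """Sample-size-weighted average of per-client parameter lists.

    client_weights: one list of NumPy arrays per client, same shapes across clients.
    client_sizes:   number of local training samples held by each client.
    """
    total = float(sum(client_sizes))
    n_layers = len(client_weights[0])
    return [
        sum(w[i] * (n / total) for w, n in zip(client_weights, client_sizes))
        for i in range(n_layers)
    ]


# Three hypothetical institutions, each holding one 2x2 weight matrix and one bias vector.
clients = [
    [np.full((2, 2), 1.0), np.zeros(2)],
    [np.full((2, 2), 2.0), np.ones(2)],
    [np.full((2, 2), 4.0), np.full(2, 2.0)],
]
sizes = [100, 100, 200]  # assumed local dataset sizes

# One aggregation round: only parameters leave each client, never raw transactions.
global_weights = fedavg(clients, sizes)
```

In a full training loop, the server would send `global_weights` back to every institution, each would train locally for some number of steps, and the aggregation would repeat once per communication round.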