Histopathological Image Classification and Vulnerability Analysis using Federated Learning

Healthcare is one of the foremost applications of machine learning (ML). Traditionally, ML models are trained on a central server, which aggregates data from many distributed devices and then predicts outcomes for newly generated data. Because the central model has access to sensitive user information, this design raises privacy concerns. Federated learning (FL) can help address this issue: a copy of the global model is sent to every client, each client trains its copy locally, and the clients send their updates (weights) back to the global model. Over successive rounds, the global model improves and becomes more accurate. Data privacy is protected during training because training takes place locally on the clients' devices. However, the global model is susceptible to data poisoning. We develop a privacy-preserving FL technique for a skin cancer dataset and show that the model is prone to data poisoning attacks. Ten clients train the model, but one of them intentionally introduces flipped labels as an attack, which reduces the accuracy of the global model. As the percentage of flipped labels increases, the accuracy decreases noticeably. We use stochastic gradient descent as the optimization algorithm to maximize the model's accuracy. Although FL can protect user privacy in healthcare diagnostics, it is also vulnerable to data poisoning, which must be addressed.
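
The training loop and the label-flipping attack described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a logistic-regression model on synthetic binary data instead of a skin cancer dataset, and it assumes weighted averaging of client weights (FedAvg-style) as the aggregation rule, which the abstract does not name. The function names (`flip_labels`, `local_sgd`, `fed_avg`, `run`) and all hyperparameters are hypothetical.

```python
import numpy as np

def make_data(rng, n, w_true):
    """Synthetic binary data; a stand-in for the real image features (assumption)."""
    X = rng.normal(size=(n, w_true.size))
    y = (X @ w_true > 0).astype(int)
    return X, y

def flip_labels(y, fraction, rng):
    """Return a copy of y with the given fraction of binary labels flipped."""
    y = y.copy()
    n_flip = int(round(fraction * y.size))
    idx = rng.choice(y.size, size=n_flip, replace=False)
    y[idx] = 1 - y[idx]
    return y

def local_sgd(w, X, y, rng, lr=0.1, epochs=1, batch=16):
    """One client's local training: mini-batch SGD on the logistic loss."""
    w = w.copy()
    for _ in range(epochs):
        order = rng.permutation(len(y))
        for start in range(0, len(y), batch):
            b = order[start:start + batch]
            p = 1.0 / (1.0 + np.exp(-(X[b] @ w)))
            w -= lr * X[b].T @ (p - y[b]) / len(b)
    return w

def fed_avg(weights, sizes):
    """Aggregate client weights, weighted by local dataset size (assumed rule)."""
    sizes = np.asarray(sizes, dtype=float)
    return np.average(np.stack(weights), axis=0, weights=sizes)

def run(flip_fraction, n_clients=10, rounds=20, seed=0):
    """Train with one malicious client and return global test accuracy."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=5)
    clients = [make_data(rng, 200, w_true) for _ in range(n_clients)]
    X_test, y_test = make_data(rng, 1000, w_true)

    # Client 0 is the attacker: it flips a fraction of its own labels.
    X0, y0 = clients[0]
    clients[0] = (X0, flip_labels(y0, flip_fraction, rng))

    w = np.zeros(w_true.size)  # global model
    for _ in range(rounds):
        # Each client starts from the current global weights and trains locally.
        updates = [local_sgd(w, X, y, rng) for X, y in clients]
        # The server only ever sees weights, never the clients' data.
        w = fed_avg(updates, [len(y) for _, y in clients])
    return float(np.mean((X_test @ w > 0).astype(int) == y_test))

if __name__ == "__main__":
    for fraction in (0.0, 0.5, 1.0):
        print(f"flip fraction {fraction:.1f}: accuracy {run(fraction):.3f}")
```

On a toy linear problem like this one, a single attacker among ten clients may change accuracy only slightly, so the numbers printed here should not be read as a reproduction of the reported results. The sketch is meant only to show where the poisoned labels enter the pipeline and why the server cannot inspect them.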
