Federated learning (FL), a decentralized machine learning approach that protects users' private data, has become an important learning paradigm in recent years, especially since stricter laws and regulations came into force in most countries. As a result, a variety of FL frameworks have been released to facilitate the development and application of federated learning. Despite considerable research on the security and privacy of FL models and systems, security issues in the FL frameworks themselves have not yet been systematically studied. In this paper, we conduct the first empirical study of 1,112 FL framework bugs to investigate their characteristics. These bugs were manually collected, classified, and labeled from 12 open-source FL frameworks on GitHub. Specifically, we construct taxonomies of 15 symptoms, 12 root causes, and 20 fix patterns for these bugs, and investigate their correlations and distributions across 23 logical components and two main application scenarios. Based on the results, we present nine findings, discuss their implications, and offer several suggestions for FL framework developers and security researchers.