Machine learning continues to advance rapidly, but preserving the privacy of the users whose data is used to train these models remains a major concern. Federated learning (FL) has emerged as a promising paradigm for training machine learning models in a distributed, privacy-preserving manner: clients collaborate to train a global model without sharing their local data. How this learning process is started on each device, known as "model initialization", is critical. The choice of initialization method affects the performance, convergence speed, communication efficiency, and privacy guarantees of federated learning systems, among other properties. In this survey, we present a comprehensive study of model initialization techniques in FL. Unlike other studies, we compare and categorize these techniques, delineate the merits and demerits of each, and examine their applicability across diverse FL scenarios. We highlight how factors such as client variability, non-IID data, model quality, security considerations, and network restrictions influence FL model outcomes, and we propose how strategic initialization can address and potentially mitigate many of these challenges. The motivation behind this survey is to show that the right start can help overcome challenges such as varying data quality, security issues, and network problems. Our insights provide a foundation for practitioners who want to make full use of FL while understanding the complexities of model initialization.