The Web is naturally heterogeneous: user devices, geographic regions, browsing patterns, and contexts all produce highly diverse, unique datasets. Federated Learning (FL) is an important paradigm for the Web because it enables privacy-preserving, collaborative machine learning across diverse user devices, web services, and clients without centralizing sensitive data. However, its performance degrades severely under the non-IID client distributions that are prevalent in real-world web systems. In this work, we propose a new training paradigm, Iterative Federated Adaptation (IFA), that enhances generalization in heterogeneous federated settings through a generation-wise forget-and-evolve strategy. Specifically, we divide training into multiple generations and, at the end of each, select a fraction of model parameters (a) randomly or (b) from the later layers of the model, and reinitialize them. This iterative forget-and-evolve schedule allows the model to escape local minima while preserving globally relevant representations. Extensive experiments on the CIFAR-10, MIT-Indoors, and Stanford Dogs datasets show that the proposed approach improves global accuracy, especially when data across clients are non-IID. The method can be implemented on top of any federated algorithm to improve its generalization performance; we observe an average improvement of 21.5% across datasets. This work advances the vision of scalable, privacy-preserving intelligence for real-world heterogeneous, distributed web systems.
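The generation-wise forget-and-evolve step described above can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the model is represented as a plain mapping from layer names to flat weight lists, and all function names (`random_reinit`, `later_layer_reinit`, `train_with_ifa`), the reinitialization scale, and the default fraction are assumptions for illustration.

```python
import random

def random_reinit(params, fraction, rng, scale=0.01):
    """Strategy (a): reinitialize a randomly chosen fraction of ALL parameters."""
    return {
        name: [rng.gauss(0.0, scale) if rng.random() < fraction else w
               for w in weights]
        for name, weights in params.items()
    }

def later_layer_reinit(params, n_layers, rng, scale=0.01):
    """Strategy (b): fully reinitialize the last n_layers layers of the model."""
    later = set(list(params)[-n_layers:])  # names of the later layers
    return {
        name: ([rng.gauss(0.0, scale) for _ in weights] if name in later
               else list(weights))
        for name, weights in params.items()
    }

def train_with_ifa(global_params, generations, rounds_per_gen,
                   federated_round, rng, fraction=0.2):
    """Wrap any federated algorithm (e.g. a FedAvg round) with IFA:
    train for a generation of rounds, then partially forget."""
    for g in range(generations):
        for _ in range(rounds_per_gen):
            global_params = federated_round(global_params)
        if g < generations - 1:  # no reinitialization after the final generation
            global_params = random_reinit(global_params, fraction, rng)
    return global_params
```

Because the forget step only touches the global model between generations, `federated_round` can be any base federated optimizer, which is what lets IFA layer on top of existing algorithms.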