FBFL: A Field-Based Coordination Approach for Data Heterogeneity in Federated Learning

In recent years, federated learning (FL) has become a popular solution for training machine learning models in domains with high privacy concerns. However, FL scalability and performance face significant challenges in real-world deployments where data across devices are not independent and identically distributed (non-IID). This heterogeneity in data distribution frequently arises from the spatial distribution of devices and degrades model performance when it is not handled properly. Additionally, FL's typical reliance on centralized architectures introduces bottlenecks and single-point-of-failure risks, which are particularly problematic at scale or in dynamic environments. To close this gap, we propose Field-Based Federated Learning (FBFL), a novel approach that leverages macroprogramming and field coordination to address these limitations through: (i) distributed spatial-based leader election for personalization, to mitigate non-IID data challenges; and (ii) construction of a self-organizing, hierarchical architecture using advanced macroprogramming patterns. Beyond overcoming these limitations, FBFL enables the development of more specialized models tailored to the specific data distribution in each subregion. This paper formalizes FBFL and evaluates it extensively using the MNIST, FashionMNIST, and Extended MNIST datasets. We demonstrate that, under IID data conditions, FBFL performs comparably to the widely used FedAvg algorithm. Furthermore, in challenging non-IID scenarios, FBFL not only outperforms FedAvg but also surpasses other state-of-the-art methods, namely FedProx and Scaffold, which were specifically designed to address non-IID data distributions. Additionally, we showcase the resilience of FBFL's self-organizing hierarchical architecture against server failures.
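To make the idea of region-level aggregation concrete, the following is a minimal Python sketch, not the algorithm formalized in the paper. It assumes that leaders are already known and that each device joins its nearest leader, which is a centralized stand-in for the distributed spatial leader election the abstract describes. Each region then averages its members' parameters, weighted by local dataset size, in the style of FedAvg. The names `fed_avg`, `assign_regions`, and `regional_fed_avg`, and the toy data, are hypothetical.

```python
import math
from collections import defaultdict


def fed_avg(models, sizes):
    """Average parameter vectors, weighting each by its local dataset size."""
    total = sum(sizes)
    dim = len(models[0])
    return [sum(m[i] * n for m, n in zip(models, sizes)) / total
            for i in range(dim)]


def assign_regions(positions, leaders):
    """Map each device to its nearest leader.

    Assumption: a centralized stand-in for distributed leader election.
    """
    return {
        dev: min(leaders, key=lambda lead: math.dist(pos, positions[lead]))
        for dev, pos in positions.items()
    }


def regional_fed_avg(positions, leaders, weights, sizes):
    """Build one aggregated model per region instead of one global model."""
    groups = defaultdict(list)
    for dev, leader in assign_regions(positions, leaders).items():
        groups[leader].append(dev)
    return {
        leader: fed_avg([weights[d] for d in devs], [sizes[d] for d in devs])
        for leader, devs in groups.items()
    }


# Toy example: two spatial clusters of devices with different local models.
positions = {"a": (0, 0), "b": (1, 0), "c": (10, 0), "d": (11, 0)}
weights = {"a": [1.0, 2.0], "b": [3.0, 4.0], "c": [10.0, 10.0], "d": [20.0, 30.0]}
sizes = {"a": 1, "b": 3, "c": 1, "d": 1}

models = regional_fed_avg(positions, ["a", "d"], weights, sizes)
print(models)
```

In this sketch, devices `a` and `b` share one model and devices `c` and `d` share another, so each model reflects only the data distribution of its own subregion. A single global FedAvg would instead blend all four devices into one model.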

