Federated learning is a prominent distributed learning paradigm that enables collaboration among diverse clients while keeping data local, thereby preserving privacy. These clients carry their own technological, cultural, and other biases in how they generate data. However, current standard approaches often ignore this bias and heterogeneity, perpetuating bias against certain groups rather than mitigating it. In response to this concern, we propose an equitable clustering-based framework in which clients are clustered according to how similar they are to each other. We propose a new way to construct the similarity matrix from activation vectors. Furthermore, we propose a client weighting mechanism to ensure that each cluster receives equal importance, and we establish an $O(1/\sqrt{K})$ rate of convergence to an $\epsilon$-stationary solution. We evaluate the proposed strategy against common baselines, demonstrating that it reduces the bias among client clusters and consequently ameliorates algorithmic bias against specific groups.
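As a rough illustration of the pipeline described above, the following minimal sketch builds a client similarity matrix from activation vectors, groups the clients, and assigns weights so that every cluster carries the same total importance. The abstract does not specify these details, so everything here is an assumption made for illustration: cosine similarity as the similarity measure, connected components of a thresholded similarity graph as the clustering rule, the `threshold` value, and the function names `cosine_similarity_matrix`, `threshold_clusters`, and `equal_cluster_weights`.

```python
import numpy as np


def cosine_similarity_matrix(activations):
    """Pairwise cosine similarity between per-client activation vectors.

    activations: array of shape (n_clients, d), one vector per client.
    Cosine similarity is an assumed choice, not taken from the paper.
    """
    a = np.asarray(activations, dtype=float)
    norms = np.linalg.norm(a, axis=1, keepdims=True)
    unit = a / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def threshold_clusters(sim, threshold=0.9):
    """Hypothetical clustering rule: connected components of the graph
    that links clients i and j whenever sim[i, j] >= threshold."""
    n = sim.shape[0]
    labels = -np.ones(n, dtype=int)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first traversal of one component
            i = stack.pop()
            for j in np.flatnonzero(sim[i] >= threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def equal_cluster_weights(labels):
    """Client weights such that each cluster's weights sum to 1 / C,
    where C is the number of clusters (weights sum to 1 overall)."""
    labels = np.asarray(labels)
    _, inverse, counts = np.unique(
        labels, return_inverse=True, return_counts=True
    )
    return 1.0 / (len(counts) * counts[inverse])


# Toy example: five clients with 2-dimensional activation vectors.
activations = np.array(
    [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.0, 1.0]]
)
sim = cosine_similarity_matrix(activations)
labels = threshold_clusters(sim, threshold=0.9)
weights = equal_cluster_weights(labels)
```

With this weighting, a client in a small cluster receives a larger individual weight than a client in a large cluster, so a minority cluster is not drowned out when the server aggregates client updates. How the weights enter the aggregation step is not stated in the abstract and is left out of the sketch.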