We address the problem of cluster identity estimation in a hierarchical federated learning setting in which users work toward learning different tasks. To overcome task heterogeneity, users must be grouped so that those with the same task train together in one group while sharing the weights of the feature extraction layers with the other groups. To that end, we propose a one-shot clustering algorithm that identifies and groups users based on the similarity of their data. This enables more efficient collaboration and the sharing of a common layer representation across the federated learning system. The proposed algorithm improves the clustering process and also addresses privacy concerns, communication overhead, and the need for prior knowledge about the learning models or the behavior of their loss functions. We validate the algorithm on datasets including CIFAR-10 and Fashion MNIST and show that it outperforms the baseline in terms of accuracy and variance reduction.
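To make the idea of one-shot, similarity-based grouping concrete, the following is a minimal sketch and not the algorithm proposed in this work, whose details are not given in the abstract. It assumes that each user sends a single compact summary of its local data (here, a hypothetical unit-norm mean vector) and that the server links users whose summaries have cosine similarity above a hypothetical threshold. The names `local_summary`, `one_shot_clusters`, and `threshold` are illustrative only, and the data are synthetic.

```python
import numpy as np


def local_summary(x):
    """Hypothetical per-user summary: the unit-norm mean of the local samples.

    Assumption: one vector per user is sent once, instead of raw data.
    """
    m = x.mean(axis=0)
    return m / (np.linalg.norm(m) + 1e-12)


def one_shot_clusters(summaries, threshold=0.9):
    """Group users in a single pass over a cosine-similarity matrix.

    Users connected by a chain of similarities >= threshold get the same label.
    The threshold value is an assumption chosen for this toy example.
    """
    sim = summaries @ summaries.T  # rows are unit-norm, so this is cosine similarity
    n = len(summaries)
    labels = [-1] * n
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = next_label
        stack = [start]
        while stack:  # depth-first search over the similarity graph
            u = stack.pop()
            for v in range(n):
                if labels[v] == -1 and sim[u, v] >= threshold:
                    labels[v] = next_label
                    stack.append(v)
        next_label += 1
    return labels


rng = np.random.default_rng(0)
# Synthetic stand-in for two tasks: users 0-2 sample around one center,
# users 3-5 around another.
centers = [np.array([5.0, 0.0, 0.0, 0.0]), np.array([0.0, 5.0, 0.0, 0.0])]
users = [centers[i // 3] + rng.normal(scale=0.5, size=(200, 4)) for i in range(6)]

summaries = np.stack([local_summary(x) for x in users])
labels = one_shot_clusters(summaries)
print(labels)
```

In this toy setup the clustering needs only one message per user and no knowledge of the model or loss, which mirrors the motivation stated above. A real system would need a summary that is both informative about the task and acceptable from a privacy standpoint.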