In this study, we address the emerging field of Streaming Federated Learning (SFL) and propose local cache update rules to manage dynamic data distributions under limited cache capacity. Traditional federated learning relies on fixed datasets, whereas in SFL data arrives as a stream whose distribution changes over time, leading to a discrepancy between each client's local training dataset and the long-term data distribution. To mitigate this problem, we propose three local cache update rules, First-In-First-Out (FIFO), Static Ratio Selective Replacement (SRSR), and Dynamic Ratio Selective Replacement (DRSR), each of which updates a client's local cache subject to its limited capacity. Furthermore, we derive a convergence bound for the proposed SFL algorithm as a function of the distribution discrepancy between the long-term data distribution and the client's local training dataset. We then evaluate the algorithm on two datasets: a network traffic classification dataset and an image classification dataset. The experimental results show that the proposed local cache update rules significantly reduce the distribution discrepancy and outperform the baseline methods. Our study advances the field of SFL and provides practical cache management solutions for federated learning.
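To make the setting concrete, the following is a minimal sketch of a capacity-limited FIFO cache on a single client, together with one possible way to quantify the gap between the cached data and a long-term distribution. It is an illustration under stated assumptions, not the paper's implementation: the function names (`fifo_update`, `label_distribution`, `l1_discrepancy`), the use of label histograms, and the L1 distance are all hypothetical choices, and the abstract does not specify how the discrepancy is measured. SRSR and DRSR are not sketched because the abstract does not define their replacement ratios.

```python
from collections import Counter, deque


def fifo_update(cache, batch):
    """Append newly streamed (sample, label) pairs to the cache.

    `cache` is a deque created with maxlen=capacity, so once it is full
    each append silently evicts the oldest entry (first in, first out).
    """
    cache.extend(batch)
    return cache


def label_distribution(labels, classes):
    """Empirical label frequencies over `classes` (all zeros if empty)."""
    counts = Counter(labels)
    total = sum(counts[c] for c in classes)
    if total == 0:
        return {c: 0.0 for c in classes}
    return {c: counts[c] / total for c in classes}


def l1_discrepancy(p, q):
    """L1 distance between two label distributions.

    Assumption: used here only as a stand-in for the paper's
    distribution discrepancy, which the abstract does not define.
    """
    return sum(abs(p[c] - q[c]) for c in p)


if __name__ == "__main__":
    classes = ["a", "b"]
    long_term = {"a": 0.5, "b": 0.5}  # assumed long-term label distribution

    cache = deque(maxlen=3)  # cache capacity of 3 samples
    # The stream drifts from class "a" to class "b" over time.
    stream = [("x1", "a"), ("x2", "a"), ("x3", "b"), ("x4", "b"), ("x5", "b")]
    fifo_update(cache, stream)

    cached = label_distribution([label for _, label in cache], classes)
    print(cached, l1_discrepancy(cached, long_term))
```

In this toy stream the FIFO cache ends up holding only the most recent class, which is the kind of mismatch with the long-term distribution that selective replacement rules are meant to reduce.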