This paper considers online distributed learning, in which learning models are trained on distributed data sources. In our setting, a set of agents must cooperatively train a learning model from streaming data. Unlike federated learning, the proposed approach does not rely on a central server, only on peer-to-peer communication among the agents. Such an approach is often used when data cannot be moved to a centralized location for privacy, security, or cost reasons. To overcome the absence of a central server, we propose a distributed algorithm that uses a quantized, finite-time coordination protocol to aggregate the locally trained models. The algorithm also allows the use of stochastic gradients during local training. Because stochastic gradients are computed on a randomly sampled subset of the local training data, the proposed algorithm is more efficient and scalable than traditional gradient descent, which uses the full local dataset at every step. We analyze the performance of the proposed algorithm in terms of the mean distance from the online solution. Finally, we present numerical results for a logistic regression task.
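As a rough illustration of the ingredients named above, the following sketch alternates local stochastic-gradient steps on a logistic loss with an aggregation of quantized local models. It is a minimal sketch under stated assumptions, not the algorithm analyzed in the paper: the function names, step sizes, the uniform quantizer, and the synthetic data are all hypothetical. In particular, `quantized_average` averages the quantized models directly as a stand-in for the peer-to-peer, finite-time coordination protocol, and each agent holds a fixed local dataset rather than a data stream.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def stochastic_gradient(w, data, batch_size, rng):
    """Logistic-loss gradient estimated on a random mini-batch of local data."""
    batch = rng.sample(data, batch_size)
    grad = [0.0] * len(w)
    for x, y in batch:  # labels y are in {0, 1}
        err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
        for j, xj in enumerate(x):
            grad[j] += err * xj / batch_size
    return grad


def quantize(v, step):
    """Uniform quantizer (assumed): round entries to multiples of `step`."""
    return [step * round(vi / step) for vi in v]


def quantized_average(models, step):
    """Placeholder aggregation: average the quantized local models.

    This is NOT the paper's peer-to-peer protocol; it only mimics its goal
    of agreeing on an average of quantized values.
    """
    quantized = [quantize(m, step) for m in models]
    return [sum(col) / len(quantized) for col in zip(*quantized)]


def run(w_true=(2.0, -1.0), num_agents=4, samples=64, rounds=50,
        local_steps=5, lr=0.5, batch_size=8, step=1e-3, seed=0):
    rng = random.Random(seed)
    # Synthetic local datasets (assumption: static data, not a stream).
    datasets = []
    for _ in range(num_agents):
        local = []
        for _ in range(samples):
            x = [rng.gauss(0.0, 1.0) for _ in w_true]
            p = sigmoid(sum(a * b for a, b in zip(w_true, x)))
            local.append((x, 1 if rng.random() < p else 0))
        datasets.append(local)

    w = [0.0] * len(w_true)
    for _ in range(rounds):
        models = []
        for data in datasets:
            wi = list(w)  # every agent starts from the shared model
            for _ in range(local_steps):
                g = stochastic_gradient(wi, data, batch_size, rng)
                wi = [a - lr * b for a, b in zip(wi, g)]
            models.append(wi)
        w = quantized_average(models, step)  # aggregate local models
    return w


if __name__ == "__main__":
    print(run())
```

Each mini-batch gradient touches only `batch_size` samples instead of the whole local dataset, which is the source of the efficiency gain the abstract refers to. The quantization step `step` controls how coarsely the models are represented before aggregation.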