We study the problem of communication-efficient distributed vector mean estimation, a commonly used subroutine in distributed optimization and Federated Learning (FL). Rand-$k$ sparsification is a widely used technique for reducing communication cost, in which each client sends only $k < d$ of its $d$ coordinates to the server. However, Rand-$k$ is agnostic to any correlations that might exist between clients in practical scenarios. The recently proposed Rand-$k$-Spatial estimator leverages cross-client correlation information at the server to improve on Rand-$k$, yet its performance is still suboptimal. We propose the Rand-Proj-Spatial estimator, which has a more flexible encoding-decoding procedure and generalizes the encoding of Rand-$k$ by projecting the client vectors onto a random $k$-dimensional subspace. We use the Subsampled Randomized Hadamard Transform (SRHT) as the projection matrix and show that Rand-Proj-Spatial with SRHT outperforms Rand-$k$-Spatial by using the correlation information more efficiently. Furthermore, we propose an approach to incorporate varying degrees of correlation and suggest a practical variant of Rand-Proj-Spatial for the case where the correlation information is not available to the server. Experiments on real-world distributed optimization tasks show that Rand-Proj-Spatial outperforms Rand-$k$-Spatial and other more sophisticated sparsification techniques.
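To make the two encodings concrete, the following is a minimal sketch, not the estimator proposed in this work. It shows plain Rand-$k$ mean estimation and an SRHT-style projection of a single client vector. All function names are hypothetical, $d$ is assumed to be a power of two, and the SRHT decoder is a naive unbiased one, $(d/k)\,G^\top y$, that ignores cross-client correlation; the Rand-Proj-Spatial decoder is not reproduced here.

```python
import random


def rand_k_encode(x, k, rng):
    """Client side: keep k uniformly chosen coordinates of x (indices + values)."""
    idx = rng.sample(range(len(x)), k)
    return idx, [x[j] for j in idx]


def rand_k_mean(messages, d, k):
    """Server side: unbiased mean estimate from n Rand-k messages.

    Each coordinate survives with probability k/d, so the sum is rescaled by d/k.
    """
    n = len(messages)
    est = [0.0] * d
    for idx, vals in messages:
        for j, v in zip(idx, vals):
            est[j] += v
    scale = d / (k * n)
    return [scale * e for e in est]


def fwht(v):
    """Unnormalized fast Walsh-Hadamard transform; len(v) must be a power of two."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for j in range(start, start + h):
                a, b = v[j], v[j + h]
                v[j], v[j + h] = a + b, a - b
        h *= 2
    return v


def srht_encode(x, k, rng):
    """Client side: compute y = G x with G = (1/sqrt(d)) * E * H * D.

    D flips signs at random, H is the Hadamard matrix, E samples k rows.
    Assumption: signs and rows are returned explicitly for clarity; a real
    system would more likely share a random seed instead.
    """
    d = len(x)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(d)]
    rows = rng.sample(range(d), k)
    hx = fwht([s * xi for s, xi in zip(signs, x)])
    return signs, rows, [hx[r] / d ** 0.5 for r in rows]


def srht_decode(signs, rows, y):
    """Server side, naive decoder (an assumption): (d/k) * G^T y."""
    d, k = len(signs), len(rows)
    z = [0.0] * d
    for r, val in zip(rows, y):
        z[r] = val                      # E^T y: scatter back to d dimensions
    hz = fwht(z)                        # H^T = H for the Sylvester construction
    return [(d / k) * s * h / d ** 0.5 for s, h in zip(signs, hz)]
```

With $k = d$ both schemes are lossless: `rand_k_mean` returns the exact mean and `srht_decode(*srht_encode(x, d, rng))` returns `x` up to floating-point error. For $k < d$ each call yields a noisy but unbiased estimate.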