Sketching is one of the most fundamental tools in large-scale machine learning. It saves runtime and memory by randomly compressing the original large problem into a lower-dimensional one. In this paper, we propose a novel sketching scheme for first-order methods in the large-scale distributed learning setting, which reduces the communication cost between distributed agents while still guaranteeing convergence of the algorithms. Given gradient information in a high dimension $d$, an agent sends the information compressed by a sketching matrix $R\in \mathbb{R}^{s\times d}$ with $s\ll d$, and the receiver decompresses it via the de-sketching matrix $R^\top$ to ``recover'' the information in the original dimension. Using this framework, we develop algorithms for federated learning with lower communication costs. However, such random sketching does not directly protect the privacy of local data. We show that the gradient leakage problem persists after sketching by presenting a specific gradient attack method. As a remedy, we rigorously prove that the algorithm becomes differentially private when additional random noise is added to the gradient information, which yields a first-order approach for federated learning tasks that is both communication-efficient and differentially private. Our sketching scheme can be further generalized to other learning settings and may be of independent interest.
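The sketch and de-sketch step described above can be illustrated with a minimal Python sketch. It rests on assumptions the abstract does not state: $R$ is drawn with i.i.d. Gaussian entries of variance $1/s$ (so that $R^\top R$ equals the identity in expectation), and the noise is added to the sketched vector with a hypothetical scale `sigma` that is not calibrated to any privacy budget. All function names are illustrative and are not taken from the paper.

```python
import random


def make_sketch(s, d, rng):
    """Draw R in R^{s x d} with i.i.d. N(0, 1/s) entries (assumed choice),
    so that E[R^T R] is the d x d identity."""
    scale = (1.0 / s) ** 0.5
    return [[rng.gauss(0.0, scale) for _ in range(d)] for _ in range(s)]


def sketch(R, g):
    """Compress a length-d gradient g to the length-s vector R g."""
    return [sum(r_ij * g_j for r_ij, g_j in zip(row, g)) for row in R]


def desketch(R, c):
    """Map a length-s vector c back to dimension d via R^T c."""
    d = len(R[0])
    return [sum(R[i][j] * c[i] for i in range(len(R))) for j in range(d)]


def add_gaussian_noise(c, sigma, rng):
    """Add independent N(0, sigma^2) noise to each coordinate.
    sigma is a hypothetical parameter, not a calibrated privacy level."""
    return [x + rng.gauss(0.0, sigma) for x in c]


def average_reconstruction(g, s, trials, seed=0):
    """Average R^T R g over independent draws of R to illustrate that the
    de-sketched gradient is unbiased under the Gaussian assumption."""
    rng = random.Random(seed)
    d = len(g)
    total = [0.0] * d
    for _ in range(trials):
        R = make_sketch(s, d, rng)
        g_hat = desketch(R, sketch(R, g))
        total = [t + x for t, x in zip(total, g_hat)]
    return [t / trials for t in total]
```

In this toy setup an agent would transmit only the $s$ numbers returned by `sketch` (optionally passed through `add_gaussian_noise`), and the receiver would apply `desketch`. A single reconstruction is noisy, but `average_reconstruction` suggests why convergence can still hold: the error averages out across draws of $R$.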