The exponential growth in model sizes has significantly increased the communication burden in Federated Learning (FL). Existing methods alleviate this burden by transmitting compressed gradients, but they often suffer from high compression errors, which slow the model's convergence. To achieve high compression effectiveness and low compression errors simultaneously, we study the gradient compression problem from a novel perspective. Specifically, we propose a systematic algorithm termed Extended Single-Step Synthetic Features Compressing (E-3SFC), which consists of three sub-components: the Single-Step Synthetic Features Compressor (3SFC), a double-way compression algorithm, and a communication budget scheduler. First, we regard a model's gradient computation as decompressing gradients from the corresponding inputs, and the inverse process as compressing the gradients. Based on this, we introduce a novel gradient compression method termed 3SFC, which uses the model itself as a decompressor, leveraging training priors such as model weights and objective functions. 3SFC compresses raw gradients into tiny synthetic features in a single-step simulation, incorporating error feedback to minimize overall compression errors. To further reduce communication overhead, 3SFC is extended to E-3SFC, enabling double-way compression and dynamic communication budget scheduling. Our theoretical analysis under both strongly convex and non-convex conditions demonstrates that 3SFC achieves linear and sub-linear convergence rates, respectively, in the presence of aggregation noise. Extensive experiments across six datasets and six models reveal that 3SFC outperforms state-of-the-art methods by up to 13.4% while reducing communication costs by 111.6 times. These findings suggest that 3SFC can significantly enhance communication efficiency in FL without compromising model performance.
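To make the core idea concrete, the following is a minimal, self-contained sketch of the compression-as-inverse-gradient-computation view described above: a client searches for a tiny synthetic feature whose induced gradient on the shared model approximates the raw gradient, transmits only that feature, and carries the residual forward via error feedback. All names (`compress`, `decompress`, the tiny MLP, the optimizer settings) are illustrative assumptions, not the paper's implementation; in particular, 3SFC backpropagates through the gradient computation itself, whereas this sketch substitutes a dependency-free finite-difference descent.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny shared model (known to both client and server): 4 -> 8 -> 1 MLP,
# tanh hidden layer, squared-error loss. 40 parameters in total.
D_IN, H = 4, 8
W1 = rng.normal(scale=0.5, size=(H, D_IN))
W2 = rng.normal(scale=0.5, size=(H,))

def flat_grad(x, y):
    """Gradient of 0.5 * (f(x) - y)^2 w.r.t. (W1, W2), flattened (manual backprop)."""
    h = np.tanh(W1 @ x)              # hidden activations
    err = W2 @ h - y                 # prediction error (scalar)
    g_W2 = err * h
    g_z = (err * W2) * (1.0 - h ** 2)  # backprop through tanh
    g_W1 = np.outer(g_z, x)
    return np.concatenate([g_W1.ravel(), g_W2])

def compress(g_target, steps=300, lr=0.05, eps=1e-4):
    """Fit a synthetic feature z = (input, label) whose induced gradient
    matches g_target. Finite differences + backtracking stand in for the
    backprop-through-gradients used by the actual method."""
    z = rng.normal(size=D_IN + 1)
    def loss(v):
        return float(np.sum((flat_grad(v[:D_IN], v[D_IN]) - g_target) ** 2))
    hist = []
    for _ in range(steps):
        base = loss(z)
        hist.append(base)
        grad = np.zeros_like(z)
        for i in range(z.size):      # forward finite differences
            zp = z.copy(); zp[i] += eps
            grad[i] = (loss(zp) - base) / eps
        step, cand = lr, z - lr * grad
        while loss(cand) > base and step > 1e-10:  # backtracking line search
            step *= 0.5
            cand = z - step * grad
        if loss(cand) <= base:
            z = cand
    hist.append(loss(z))
    return z, hist

def decompress(z):
    """Server side: rebuild the gradient from the synthetic feature alone,
    since the model weights are already shared."""
    return flat_grad(z[:D_IN], z[D_IN])

# One round: compress a raw gradient (5 numbers sent instead of 40),
# keeping the compression residual in an error-feedback buffer.
x_real, y_real = rng.normal(size=D_IN), 1.0
g_raw = flat_grad(x_real, y_real)    # the gradient a client would upload
e = np.zeros_like(g_raw)             # error-feedback buffer
z, hist = compress(g_raw + e)
g_hat = decompress(z)
e = (g_raw + e) - g_hat              # residual carried into the next round
```

The sketch highlights why the model can act as the decompressor: the server never receives gradient entries, only a feature the model maps back to a gradient, so the achievable compression ratio grows with the ratio of parameter count to input size.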