The rapid growth in the parameter scale of large language models (LLMs) has created strong demand for efficient compression techniques. As a hardware-agnostic and highly compatible approach, low-rank compression has been widely adopted. However, existing methods typically compress each layer independently by minimizing per-layer reconstruction error, overlooking a critical limitation: reconstruction error propagates and accumulates through the network, amplifying global deviations from the full-precision baseline. To address this, we propose Self-Adaptive Error Suppression SVD (SAES-SVD), an LLM compression framework that jointly optimizes intra-layer reconstruction and inter-layer error compensation. SAES-SVD comprises two novel components: (1) Cumulative Error-Aware Layer Compression (CEALC), which formulates the compression objective as a combination of local reconstruction and weighted cumulative error compensation. From this objective, we derive a closed-form low-rank solution that relies on second-order activation statistics and explicitly aligns each layer's output with its full-precision counterpart to compensate for accumulated errors. (2) Adaptive Collaborative Error Suppression (ACES), which automatically adjusts the weighting coefficient to strengthen the low-rank structure of the compression objective in CEALC. Specifically, the coefficient is optimized to maximize the ratio between the Frobenius norm of the compressed layer's output and that of the compression objective at a fixed rank, ensuring that the rank budget is used effectively. Extensive experiments across multiple LLM architectures and tasks show that, without fine-tuning or mixed-rank strategies, SAES-SVD consistently improves post-compression performance.
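The abstract does not give the exact closed-form solution, but the two ideas can be illustrated with a minimal NumPy sketch. Assumptions (not from the source): the CEALC objective is modeled as the target activation matrix `T = W @ X + lam * E_prev`, where `E_prev` stands for accumulated upstream error; the closed-form low-rank step is modeled as a rank-`r` truncated SVD of `T`; and the ACES ratio is computed from singular values, since for a rank-`r` truncation `||T_r||_F / ||T||_F` equals the energy fraction captured by the top-`r` singular values. All function names here are hypothetical, not the paper's API.

```python
import numpy as np

def compress_layer(W, X, E_prev, rank, lam):
    """Sketch of a CEALC-style step: fit a low-rank layer whose output
    matches the local reconstruction plus a weighted error-compensation
    term (hypothetical form of the objective)."""
    # Compression objective: local output W @ X plus weighted cumulative
    # error compensation E_prev from upstream layers.
    T = W @ X + lam * E_prev                      # (d_out, n)
    # Closed-form low-rank fit: rank-r truncated SVD is the best
    # Frobenius-norm approximation of T (Eckart-Young).
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    T_r = (U[:, :rank] * s[:rank]) @ Vt[:rank]    # rank-r output
    # Map the low-rank output back to a weight matrix via least squares
    # against the calibration activations X.
    W_low = T_r @ np.linalg.pinv(X)               # rank(W_low) <= rank
    return W_low, T_r

def choose_lambda(W, X, E_prev, rank, grid):
    """Sketch of an ACES-style search: pick the coefficient that maximizes
    ||T_r||_F / ||T||_F at the fixed rank, i.e. the singular-value energy
    captured by the rank budget."""
    best_lam, best_ratio = grid[0], -1.0
    for lam in grid:
        T = W @ X + lam * E_prev
        s = np.linalg.svd(T, compute_uv=False)
        ratio = np.sqrt((s[:rank] ** 2).sum() / (s ** 2).sum())
        if ratio > best_ratio:
            best_lam, best_ratio = lam, ratio
    return best_lam
```

In practice the paper derives its solution from second-order activation statistics (e.g. a Gram matrix of calibration activations) rather than a raw activation matrix `X`; this sketch only conveys the shape of the two-step procedure, not the actual derivation.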