Large language models (LLMs) have driven major advances across domains, yet their massive size hinders deployment in resource-constrained settings. Model compression addresses this challenge, with low-rank factorization emerging as a particularly effective method for reducing model size, memory footprint, and computation while maintaining accuracy. However, while these compressed models deliver strong benign performance and system-level advantages, their trustworthiness implications remain poorly understood. In this paper, we present the first comprehensive study of how low-rank factorization affects LLM trustworthiness across privacy, adversarial robustness, fairness, and ethical alignment. We evaluate multiple LLMs of different sizes and variants compressed with diverse low-rank algorithms, revealing key insights: (1) low-rank compression preserves or improves training data privacy but weakens PII protection during conversations; (2) adversarial robustness is generally preserved and often enhanced, even under deep compression; (3) ethical reasoning degrades in zero-shot settings but partially recovers with few-shot prompting; (4) fairness declines under compression. Beyond compression, we investigate how model scale and fine-tuning affect trustworthiness, as both factors play central roles in low-rank methods. To guide trustworthy compression strategies, we conclude with a gradient-based attribution analysis that identifies which layers in LLMs contribute most to adversarial robustness.
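To make the compression method concrete, the following is a minimal sketch of low-rank factorization via truncated SVD applied to a single dense weight matrix. The matrix dimensions, the rank r, and the choice to fold singular values into the left factor are illustrative assumptions, not the specific algorithms evaluated in the paper.

```python
import numpy as np

def low_rank_factorize(W: np.ndarray, r: int):
    """Approximate W (d_out x d_in) by the rank-r product A @ B."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :r] * S[:r]  # (d_out, r); singular values folded into A
    B = Vt[:r, :]         # (r, d_in)
    return A, B

# Hypothetical layer weight; real LLM layers are loaded from a checkpoint.
W = np.random.randn(4096, 4096)
A, B = low_rank_factorize(W, r=256)

# Parameter count drops from d_out * d_in to r * (d_out + d_in):
print(W.size, A.size + B.size)  # 16777216 vs 2097152 (~8x smaller)
```

Replacing W with A @ B reduces both storage and the per-token matrix-multiply cost, which is the size/memory/computation trade-off the abstract refers to; the trustworthiness questions studied in the paper concern what else changes when W is approximated this way.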