Evaluating the importance of different layers in large language models (LLMs) is crucial for optimizing model performance and interpretability. This paper first explores layer importance using the Activation Variance-Sparsity Score (AVSS), which combines normalized activation variance with sparsity to quantify each layer's contribution to overall model performance. Ranking layers by AVSS and pruning the least impactful 25\%, our experiments on question answering, language modeling, and sentiment classification show that over 90\% of the original performance is retained, highlighting potential redundancy in LLM architectures. Building on AVSS, we propose an enhanced version tailored to assessing hallucination propensity across layers (EAVSS). This approach introduces Hallucination-Specific Activation Variance (HSAV) and Hallucination-Specific Sparsity (HSS) metrics, enabling precise identification of hallucination-prone layers. By applying contrastive learning to these layers, we effectively mitigate hallucination generation (with a maximum performance improvement of 12\%), contributing to more robust and efficient LLMs. Results on the NQ, SciQ, TriviaQA, TruthfulQA, and WikiQA datasets demonstrate the efficacy of this method, offering a comprehensive framework for both layer importance evaluation and hallucination mitigation in LLMs.
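The AVSS-based pruning pipeline described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the exact way AVSS combines normalized variance and sparsity is an assumption here (taken as normalized variance weighted by the non-sparse fraction), and the function names are hypothetical.

```python
import numpy as np

def avss_scores(layer_activations, eps=1e-6):
    """Hypothetical AVSS sketch: score each layer by combining
    normalized activation variance with activation sparsity."""
    # layer_activations: one 1-D array of sampled activations per layer.
    variances = np.array([a.var() for a in layer_activations])
    # Sparsity: fraction of activations that are (near-)zero.
    sparsities = np.array([np.mean(np.abs(a) < eps) for a in layer_activations])
    # Normalize variance across layers so scores are comparable.
    var_norm = variances / (variances.sum() + eps)
    # Assumed combination: high variance and low sparsity -> high importance.
    return var_norm * (1.0 - sparsities)

def layers_to_prune(scores, frac=0.25):
    """Indices of the lowest-scoring `frac` of layers (pruning candidates)."""
    k = max(1, int(len(scores) * frac))
    return sorted(np.argsort(scores)[:k].tolist())
```

For example, given four layers where one produces all-zero activations, that layer receives the lowest AVSS and is selected as the 25\% pruning candidate.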