This study explores the mechanism of factual knowledge storage in pre-trained language models (PLMs). Previous research suggests that factual knowledge is stored within multi-layer perceptron weights, and that some storage units exhibit degeneracy; these units are referred to as Degenerate Knowledge Neurons (DKNs). This paper provides a comprehensive definition of DKNs that covers both structural and functional aspects, pioneering the study of structures in PLMs' factual knowledge storage units. Based on this definition, we introduce the Neurological Topology Clustering method, which allows DKNs to form in any number and with any structure, leading to more accurate DKN acquisition. Furthermore, we introduce the Neuro-Degeneracy Analytic Analysis Framework, which uniquely integrates model robustness, evolvability, and complexity for a holistic assessment of PLMs. Within this framework, 34 experiments across 2 PLMs, 4 datasets, and 6 settings highlight the critical role of DKNs. The code will be available soon.