To understand how well a large language model captures particular semantic or syntactic features, researchers typically apply probing classifiers. However, the accuracy of these classifiers is critical for the correct interpretation of the results. If a probing classifier shows low accuracy, this may mean either that the language model does not capture the property under investigation, or that the classifier itself is unable to adequately extract the characteristics encoded in the model's internal representations. Consequently, for more reliable diagnosis, it is necessary to use the most accurate classifiers available for a given type of task. Logistic regression on the output representations of a transformer layer is the most common choice for probing the syntactic properties of a language model. We show that using gradient boosting decision trees on the Knowledge Neuron layer, i.e., the hidden layer of the transformer's feed-forward network, as a probing classifier for recognizing parts of a sentence is more advantageous than logistic regression on the output representations of the transformer layer, and also outperforms many other methods. Depending on the configuration, the reduction in error rate ranges from 9% to 54%.
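The following is a minimal sketch of the two probing setups contrasted above, assuming a BERT-style encoder from Hugging Face `transformers` and scikit-learn classifiers; the model name, the probed layer index, and the dummy token-level labels are illustrative placeholders rather than the paper's actual experimental configuration.

```python
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import GradientBoostingClassifier

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

LAYER = 8  # illustrative choice of the probed encoder layer

# Capture the hidden ("Knowledge Neuron") activations of the feed-forward
# network: in BERT this is the output of the `intermediate` dense sub-layer.
ffn_acts = []
model.encoder.layer[LAYER].intermediate.register_forward_hook(
    lambda module, inputs, output: ffn_acts.append(output.detach())
)

sentence = "The cat sat on the mat ."
enc = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    out = model(**enc, output_hidden_states=True)

# Baseline probe: logistic regression on the layer's output representations.
X_out = out.hidden_states[LAYER + 1][0].numpy()   # (seq_len, hidden_size)
# Proposed probe: gradient boosting on the FFN hidden-layer activations.
X_ffn = ffn_acts[0][0].numpy()                    # (seq_len, 4 * hidden_size)

# Dummy token-level labels standing in for sentence-part annotations;
# a real probe would train on labels from an annotated corpus and
# evaluate on held-out sentences.
y = np.arange(X_out.shape[0]) % 2

lr_probe = LogisticRegression(max_iter=1000).fit(X_out, y)
gbt_probe = GradientBoostingClassifier(n_estimators=50).fit(X_ffn, y)
print(lr_probe.score(X_out, y), gbt_probe.score(X_ffn, y))
```

In this sketch the baseline and proposed probes differ in both the classifier and the representation it reads: logistic regression sees the layer's residual-stream output, while the gradient-boosted trees see the wider hidden activations of the feed-forward sub-layer.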