The usual way to interpret language models (LMs) is to test their performance on different benchmarks and subsequently infer their internal processes. In this paper, we present an alternative approach, concentrating on the quality of LMs' processing, with a focus on their language abilities. To this end, we construct 'linguistic task spaces' -- representations of an LM's language conceptualisation -- that shed light on the connections LMs draw between language phenomena. Task spaces are based on the interactions of the learning signals from different linguistic phenomena, which we assess via a method we call 'similarity probing'. To disentangle the learning signals of individual linguistic phenomena, we further introduce a method called 'fine-tuning via gradient differentials' (FTGD). We apply our methods to language models of three different scales and find that larger models generalise better to overarching general concepts for linguistic tasks, making better use of their shared structure. Further, the distributedness of linguistic processing increases with pre-training through increased parameter sharing between related linguistic tasks. The overall generalisation patterns are mostly stable throughout training and not marked by incisive stages, potentially explaining the lack of successful curriculum strategies for LMs.
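To make the two methods concrete, the sketch below shows one plausible reading of the abstract in PyTorch: a phenomenon's learning signal is isolated as the gradient differential between the losses of a grammatical/ungrammatical minimal pair, task similarity is probed as the cosine similarity between two such differentials, and an FTGD-style step updates only the parameters carrying the largest-magnitude differential. The function names, the plain SGD update rule, and the top-k subspace selection are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch, assuming a PyTorch LM and scalar losses computed from
# separate forward passes on the two sentences of a minimal pair.
# All names and hyperparameters here are hypothetical.
import torch
import torch.nn.functional as F


def flat_grad(model, loss):
    """Gradient of `loss` w.r.t. all trainable parameters, as one flat vector.

    Each loss must come from its own forward pass so that its graph is intact.
    """
    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params)
    return torch.cat([g.reshape(-1) for g in grads])


def gradient_differential(model, loss_grammatical, loss_ungrammatical):
    """Isolate a phenomenon's learning signal as the difference between the
    gradients of a minimal pair; signal shared by both sentences cancels."""
    g_good = flat_grad(model, loss_grammatical)
    g_bad = flat_grad(model, loss_ungrammatical)
    return g_good - g_bad


def similarity_probe(diff_a, diff_b):
    """Gradient-level similarity probing: cosine similarity between the
    differential learning signals of two linguistic phenomena."""
    return F.cosine_similarity(diff_a, diff_b, dim=0).item()


def ftgd_step(model, diff, lr=1e-4, top_frac=0.01):
    """FTGD-style update: descend along the gradient differential, but only
    in the sparse subspace of parameters with the largest differentials."""
    k = max(1, int(top_frac * diff.numel()))
    threshold = diff.abs().topk(k).values.min()
    mask = (diff.abs() >= threshold).float()
    update = lr * diff * mask
    offset = 0
    with torch.no_grad():
        for p in model.parameters():
            if not p.requires_grad:
                continue
            n = p.numel()
            p -= update[offset:offset + n].view_as(p)
            offset += n
```

Under these assumptions, a task space would be built by averaging gradient differentials over many minimal pairs per phenomenon (e.g. from a benchmark such as BLiMP) and filling a phenomenon-by-phenomenon matrix with `similarity_probe` scores; how exactly the paper aggregates pairs and selects the trained subspace is not specified in the abstract.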