Large Language Models (LLMs) exhibit impressive performance on a range of NLP tasks, owing to the general-purpose linguistic knowledge acquired during pretraining. Existing model interpretability research (Tenney et al., 2019) suggests that a linguistic hierarchy emerges across the LLM layers, with lower layers better suited to solving syntactic tasks and higher layers employed for semantic processing. Yet little is known about how encodings of different linguistic phenomena interact within the models, and to what extent the processing of linguistically related categories relies on the same, shared model representations. In this paper, we propose a framework for testing the joint encoding of linguistic categories in LLMs. Focusing on syntax, we find evidence of joint encoding both at the same level of the linguistic hierarchy (related part-of-speech (POS) classes) and across different levels (POS classes and related syntactic dependency relations). Our cross-lingual experiments show that the same patterns hold across languages in multilingual LLMs.