We introduce a novel analysis that leverages linguistic minimal pairs to probe the internal linguistic representations of Large Language Models (LLMs). By measuring the similarity between LLM activation differences across minimal pairs, we quantify linguistic similarity and gain insight into the linguistic knowledge captured by LLMs. Our large-scale experiments, spanning 100+ LLMs and 150k minimal pairs in three languages, reveal properties of linguistic similarity along four key aspects: consistency across LLMs, relation to theoretical categorizations, dependence on semantic context, and cross-lingual alignment of relevant phenomena. Our findings suggest that 1) linguistic similarity is significantly influenced by training data exposure, leading to higher cross-LLM agreement in higher-resource languages; 2) linguistic similarity strongly aligns with fine-grained theoretical linguistic categories but only weakly with broader ones; 3) linguistic similarity correlates weakly with semantic similarity, revealing its context-dependent nature; and 4) LLMs exhibit limited cross-lingual alignment in their understanding of relevant linguistic phenomena. This work demonstrates the potential of minimal pairs as a window into the neural representations of language in LLMs, shedding light on the relationship between LLMs and linguistic theory. Code and data are available at https://github.com/ChenDelong1999/Linguistic-Similarity
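To make the measurement concrete, the sketch below illustrates one plausible instantiation of the core idea, not the authors' released implementation (see the repository above for that): extract a sentence-level activation for each member of a minimal pair, take the within-pair activation difference, and compare the difference vectors of two pairs via cosine similarity. The model name, layer choice, mean pooling, and example sentences are all illustrative assumptions.

```python
# Minimal sketch of "linguistic similarity" as the cosine similarity
# between LLM activation differences elicited by two minimal pairs.
# Assumptions: gpt2 as the LLM, last-layer hidden states, mean pooling.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "gpt2"  # illustrative; any LM exposing hidden states works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def sentence_activation(text: str, layer: int = -1) -> torch.Tensor:
    """Mean-pooled hidden state of one sentence at the given layer."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        outputs = model(**inputs)
    return outputs.hidden_states[layer].mean(dim=1).squeeze(0)

def pair_difference(grammatical: str, ungrammatical: str) -> torch.Tensor:
    """Activation difference elicited by a single minimal pair."""
    return sentence_activation(grammatical) - sentence_activation(ungrammatical)

# Two minimal pairs targeting subject-verb agreement (examples are made up)
d1 = pair_difference("The cats sleep.", "The cats sleeps.")
d2 = pair_difference("The dogs run.", "The dogs runs.")

similarity = torch.nn.functional.cosine_similarity(d1, d2, dim=0)
print(f"linguistic similarity: {similarity.item():.3f}")
```

Pairs probing the same linguistic phenomenon should, under the paper's hypothesis, yield more similar difference vectors than pairs probing unrelated phenomena.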