This study investigates the internal representations of verb-particle combinations within transformer-based large language models (LLMs), specifically examining how these models capture lexical and syntactic nuances at different neural network layers. Employing the BERT architecture, we analyse the representational efficacy of its layers for various verb-particle constructions such as 'agree on', 'come back', and 'give up'. Our methodology includes detailed dataset preparation from the British National Corpus, followed by extensive model training and output analysis using techniques such as multi-dimensional scaling (MDS) and generalized discrimination value (GDV) calculations. Results show that BERT's middle layers most effectively capture syntactic structures, with significant variability in representational accuracy across different verb categories. These findings challenge the assumption that neural networks process linguistic elements uniformly and suggest a complex interplay between network architecture and linguistic representation. Our research contributes to a better understanding of how deep learning models comprehend and process language, offering insights into the potential and limitations of current neural approaches to linguistic analysis. This study not only advances our knowledge in computational linguistics but also prompts further research into optimizing neural architectures for enhanced linguistic precision.
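To make the analysis pipeline concrete, the following is a minimal sketch, not the authors' exact implementation, of how per-layer BERT representations of sentences containing verb-particle constructions can be extracted and projected into two dimensions with MDS; the model name, example sentences, mean-pooling choice, and inspected layer index are illustrative assumptions, and the resulting coordinates would then be scored for class separability (e.g. with a GDV as defined in the paper).

```python
# Minimal sketch: extract per-layer BERT hidden states for sentences with
# verb-particle constructions and project one layer with MDS.
# Assumes the Hugging Face `transformers` and `scikit-learn` packages;
# model name, sentences, pooling, and layer choice are illustrative.
import torch
from transformers import BertTokenizer, BertModel
from sklearn.manifold import MDS

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

sentences = [
    "They finally agreed on a price.",    # 'agree on'
    "She came back after a long trip.",   # 'come back'
    "He refused to give up the search.",  # 'give up'
]

layer_vectors = []  # one pooled vector per layer, per sentence
with torch.no_grad():
    for text in sentences:
        enc = tokenizer(text, return_tensors="pt")
        out = model(**enc)
        # hidden_states: tuple of embeddings + 12 layers, each [1, seq_len, 768]
        pooled = [h.mean(dim=1).squeeze(0) for h in out.hidden_states]
        layer_vectors.append(torch.stack(pooled))  # [13, 768]

# Stack to [num_sentences, 13, 768] and project one layer (here layer 6) to 2-D
X = torch.stack(layer_vectors)
layer = 6
coords = MDS(n_components=2, random_state=0).fit_transform(X[:, layer].numpy())
print(coords)  # 2-D points whose clustering can then be quantified (e.g. via GDV)
```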