While recent advances in large language models (LLMs) bring us closer to artificial general intelligence, the question persists: do LLMs truly understand language, or do they merely mimic comprehension through pattern recognition? This study explores the question through the lens of syntax, a crucial component of sentence comprehension. Adopting a natural language question-answering (Q&A) scheme, we craft questions targeting the nine syntactic knowledge points most closely related to sentence comprehension. Experiments on 24 LLMs suggest that most have a limited grasp of syntactic knowledge, with notable discrepancies across knowledge points. In particular, questions involving prepositional phrase attachment pose the greatest challenge, whereas those concerning adjectival modifiers and indirect objects are relatively easy for LLMs to handle. Furthermore, a case study on the training dynamics of LLMs reveals that the majority of syntactic knowledge is learned during the initial stages of training, suggesting that simply increasing the number of training tokens may not be the "silver bullet" for improving the comprehension ability of LLMs.