Rule-based language processing systems have been overshadowed by neural systems in terms of utility, but it remains unclear whether neural NLP systems, in practice, learn the grammar rules that humans use. This work aims to shed light on the issue by evaluating state-of-the-art LLMs on a task of morphological analysis of complex Finnish noun forms. We generate the forms with a finite-state transducer (FST) tool; because they are unlikely to have occurred in the LLMs' training sets, analysing them requires morphological generalisation capacity. We find that GPT-4-turbo has some difficulties with the task, GPT-3.5-turbo struggles, and the smaller models Llama2-70B and Poro-34B fail nearly completely.