We examine the extent to which Large Language Models (LLMs) can 'talk much' about grammar modules, drawing evidence from core syntactic properties as translated into Arabic by ChatGPT. We collected 44 terms from previous work in generative syntax, including books and journal articles, as well as from our own experience in the field. These terms were translated first by humans and then by ChatGPT-5. We then analyzed the two sets of translations using an analytical and comparative approach. The findings reveal that LLMs still cannot 'talk much' about the core syntactic properties embedded in the terms under study, which pose several syntactic and semantic challenges: only 25% of the ChatGPT translations were accurate, while 38.6% were inaccurate and 36.4% were partially correct, which we consider appropriate. Based on these findings, we propose a set of actionable strategies, the most notable of which is close collaboration between AI specialists and linguists to improve the working mechanisms of LLMs so that they produce accurate, or at least appropriate, translations.