We aim to examine the extent to which Large Language Models (LLMs) can 'talk much' about grammar modules, drawing evidence from ChatGPT's translation into Arabic of terms denoting core syntactic properties. We collected 44 terms from previous work in generative syntax, including books and journal articles, as well as from our own experience in the field. These terms were translated first by humans and then by ChatGPT-5. We then analyzed and compared the two sets of translations using an analytical and comparative approach. The findings reveal that LLMs still cannot 'talk much' about the core syntactic properties embedded in the terms under study, which involve several syntactic and semantic challenges: only 25% of the ChatGPT translations were accurate, while 38.6% were inaccurate and 36.4% were partially correct, which we consider appropriate. Based on these findings, we propose a set of actionable strategies, the most notable of which is close collaboration between AI specialists and linguists to improve LLMs' working mechanisms for accurate, or at least appropriate, translation.