Certain forms of linguistic annotation, such as part-of-speech and semantic tagging, can be automated with high accuracy. However, manual annotation is still necessary for complex pragmatic and discursive features that lack a direct mapping to lexical forms. This manual process is time-consuming and error-prone, limiting the scalability of function-to-form approaches in corpus linguistics. To address this, our study explores automating pragma-discursive corpus annotation using large language models (LLMs). We compare ChatGPT, the Bing chatbot, and a human coder in annotating apology components in English based on the local grammar framework. We find that the Bing chatbot outperformed ChatGPT, with accuracy approaching that of a human coder. These results suggest that AI can be successfully deployed to aid pragma-discursive corpus annotation, making the process more efficient and scalable.
Keywords: linguistic annotation, function-to-form approaches, large language models, local grammar analysis, Bing chatbot, ChatGPT