Certain forms of linguistic annotation, such as part-of-speech and semantic tagging, can be automated with high accuracy. However, manual annotation is still necessary for complex pragmatic and discursive features that lack a direct mapping to lexical forms. This manual process is time-consuming and error-prone, which limits the scalability of function-to-form approaches in corpus linguistics. To address this, our study explores automating pragma-discursive corpus annotation using large language models (LLMs). We compare ChatGPT, the Bing chatbot, and a human coder in annotating apology components in English based on the local grammar framework. We find that the Bing chatbot outperformed ChatGPT, with accuracy approaching that of the human coder. These results suggest that AI can be successfully deployed to aid pragma-discursive corpus annotation, making the process more efficient and scalable.

Keywords: linguistic annotation, function-to-form approaches, large language models, local grammar analysis, Bing chatbot, ChatGPT