Assessing the potential of LLM-assisted annotation for corpus-based pragmatics and discourse analysis: The case of apology

Certain forms of linguistic annotation, such as part-of-speech and semantic tagging, can be automated with high accuracy. However, manual annotation is still necessary for complex pragmatic and discursive features that lack a direct mapping to lexical forms. This manual process is time-consuming and error-prone, which limits the scalability of function-to-form approaches in corpus linguistics. To address this, our study explores the possibility of using large language models (LLMs) to automate pragma-discursive corpus annotation. We compare GPT-3.5 (the model behind the free-to-use version of ChatGPT), GPT-4 (the model underpinning the Precise mode of the Bing chatbot), and a human coder in annotating apology components in English based on the local grammar framework. We find that GPT-4 outperformed GPT-3.5, with accuracy approaching that of a human coder. These results suggest that LLMs can be successfully deployed to aid pragma-discursive corpus annotation, making the process more efficient, scalable and accessible.
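
As a rough illustration of the kind of comparison described above, the following minimal Python sketch scores one annotator's labels against a reference annotation. It is not the authors' evaluation procedure. The function names, the tag names (`APOLOGISING`, `APOLOGISER`, `OTHER`) and the four-item toy data are all hypothetical placeholders, and Cohen's kappa is included only as a common chance-corrected complement to accuracy; the abstract itself reports accuracy only.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Share of items where the predicted label equals the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("label sequences must have the same length")
    if not gold:
        raise ValueError("label sequences must not be empty")
    return sum(g == p for g, p in zip(gold, predicted)) / len(gold)


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    n = len(labels_a)
    observed = accuracy(labels_a, labels_b)
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected if both annotators labelled independently
    # according to their own label frequencies.
    expected = sum(
        counts_a[label] * counts_b[label]
        for label in set(labels_a) | set(labels_b)
    ) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy data: invented labels, not taken from the study.
gold_labels = ["APOLOGISING", "APOLOGISER", "OTHER", "APOLOGISING"]
model_labels = ["APOLOGISING", "OTHER", "OTHER", "APOLOGISING"]

print(accuracy(gold_labels, model_labels))               # 0.75
print(round(cohen_kappa(gold_labels, model_labels), 3))  # 0.6
```

Running the same two functions once per annotator (for example, each model and the human coder, each against the same reference labels) would give directly comparable scores.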
