Corpus linguists have questioned the capacity of LLMs to carry out automated qualitative analysis, arguing that corpus-based discourse analysis incorporating LLMs is hindered by unsatisfactory performance, hallucination, and irreproducibility. Our proposed method, TACOMORE, aims to address these concerns by serving as an effective prompting framework in this domain. The framework rests on four principles (Task, Context, Model, and Reproducibility) and specifies five fundamental elements of a good prompt: Role Description, Task Definition, Task Procedures, Contextual Information, and Output Format. We conduct experiments on three LLMs (GPT-4o, Gemini-1.5-Pro, and Gemini-1.5-Flash) and find that TACOMORE helps improve LLM performance in three representative discourse analysis tasks, namely the analysis of keywords, collocates, and concordances, based on an open corpus of COVID-19 research articles. Our findings show the efficacy of TACOMORE in corpus-based discourse analysis in terms of Accuracy, Ethicality, Reasoning, and Reproducibility, and provide novel insights into the application and evaluation of LLMs in automated qualitative studies.
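To make the five prompt elements concrete, the following is a minimal sketch of how a prompt could be assembled from them. It is an illustration only, not the authors' implementation: the names `PromptSpec` and `build_prompt`, the section layout, and all example wording are assumptions, and only the five element names come from the abstract.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PromptSpec:
    """Hypothetical container for the five prompt elements named in the abstract."""
    role_description: str
    task_definition: str
    task_procedures: List[str]
    contextual_information: str
    output_format: str


def build_prompt(spec: PromptSpec) -> str:
    """Join the five elements into one prompt string, in a fixed order.

    The "Title:" section layout is an assumption made for this sketch.
    """
    # Number the procedure steps so the model can follow them in sequence.
    steps = "\n".join(
        f"{i}. {step}" for i, step in enumerate(spec.task_procedures, start=1)
    )
    sections = [
        ("Role Description", spec.role_description),
        ("Task Definition", spec.task_definition),
        ("Task Procedures", steps),
        ("Contextual Information", spec.contextual_information),
        ("Output Format", spec.output_format),
    ]
    return "\n\n".join(f"{title}:\n{body}" for title, body in sections)


# Example values are invented for illustration (keyword analysis task).
example = PromptSpec(
    role_description="You are a corpus linguist experienced in discourse analysis.",
    task_definition="Group the given keywords into themes and justify each grouping.",
    task_procedures=[
        "Read the full keyword list.",
        "Assign each keyword to one theme.",
        "Give a one-sentence justification per keyword.",
    ],
    contextual_information="The keywords come from an open corpus of COVID-19 research articles.",
    output_format="One line per keyword, with the keyword, its theme, and the justification.",
)

if __name__ == "__main__":
    print(build_prompt(example))
```

Keeping the elements as separate fields and rendering them in a fixed order means the same prompt text is produced on every run, which is one simple way to support the Reproducibility principle on the prompt side.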