Deductive coding is a widely used qualitative research method for determining the prevalence of themes across documents. While useful, deductive coding is often burdensome and time-consuming because it requires researchers to read, interpret, and reliably categorize a large body of unstructured text documents. Large language models (LLMs), like ChatGPT, are a class of quickly evolving AI tools that can perform a range of natural language processing and reasoning tasks. In this study, we explore the use of LLMs to reduce the time required for deductive coding while retaining the flexibility of a traditional content analysis. We outline the proposed approach, called LLM-assisted content analysis (LACA), and present an in-depth case study using GPT-3.5 for LACA on a publicly available deductive coding data set. We also conduct an empirical benchmark using LACA on four publicly available data sets to assess the broader question of how well GPT-3.5 performs across a range of deductive coding tasks. Overall, we find that GPT-3.5 can often perform deductive coding at levels of agreement comparable to those of human coders. In addition, we demonstrate that LACA can help refine prompts for deductive coding, identify codes for which an LLM is randomly guessing, and help assess when to use LLMs versus human coders for deductive coding. We conclude with several implications for the future practice of deductive coding and related research methods.
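
As an illustration of what "levels of agreement" between an LLM and a human coder can mean in practice, the sketch below computes a chance-corrected agreement score for one binary code. The abstract does not say which agreement statistic the study uses; Cohen's kappa is assumed here purely for illustration, and the function name `cohens_kappa` and the toy labels are hypothetical. A score near 0 would be consistent with a coder that is guessing at random for that code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels.

    Assumption: kappa is used only as an example statistic; the study
    summarized above may rely on a different agreement measure.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: share of documents where both coders assign the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each coder's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both coders used one identical label throughout; kappa is undefined,
        # so this sketch reports perfect agreement by convention.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels for one code (1 = code present, 0 = absent).
human = [1, 0, 1, 1, 0, 0, 1, 0]
llm = [1, 0, 1, 0, 0, 0, 1, 1]
kappa = cohens_kappa(human, llm)
print(f"kappa = {kappa:.2f}")
```

Computing the score per code, rather than pooled over all codes, is what makes it possible to single out individual codes where the model's agreement with humans is no better than chance.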