Thematic analysis (TA) is widely used to analyze qualitative data across many disciplines and fields. To ensure reliable analysis, the same piece of data is typically assigned to at least two human coders. Moreover, to produce meaningful and useful analysis, human coders develop and deepen their data interpretation and coding over multiple iterations, making TA labor-intensive and time-consuming. Recently, the emerging field of large language model (LLM) research has shown that LLMs have the potential to replicate human-like behavior in various tasks: in particular, LLMs outperform crowd workers on text-annotation tasks, suggesting an opportunity to leverage LLMs for TA. We propose a human-LLM collaboration framework (i.e., LLM-in-the-loop) to conduct TA with in-context learning (ICL). This framework provides the prompt to frame discussions with an LLM (e.g., GPT-3.5) to generate the final codebook for TA. We demonstrate the utility of this framework using survey datasets on aspects of the music listening experience and the usage of a password manager. Results of the two case studies show that the proposed framework yields coding quality similar to that of human coders while reducing TA's labor and time demands.