Large Language Models (LLMs) are promising analytical tools. They can augment human epistemic, cognitive and reasoning abilities, and support 'sensemaking': making sense of a complex environment or subject by analysing large volumes of data with a sensitivity to context and nuance absent in earlier text processing systems. This paper presents a pilot experiment that explores how LLMs can support thematic analysis of controversial topics. We compare how human researchers and two LLMs, GPT-4 and Llama 2, categorise excerpts from media coverage of the controversial Australian Robodebt scandal. Our findings highlight overlaps and differences in thematic categorisation between human and machine agents, and suggest where LLMs can be effective in supporting discourse and thematic analysis. We argue that LLMs should be used to augment, not replace, human interpretation, and we add methodological insights and reflections to existing research on the application of automation to qualitative research methods. We also introduce a novel card-based design toolkit that researchers and practitioners can use to further interrogate LLMs as analytical tools.