Deductive coding is a discourse analysis method widely used by learning science and learning analytics researchers to understand teaching and learning interactions. It typically requires researchers to manually label every discourse segment under analysis according to a theoretically guided coding scheme, which is time-consuming and labor-intensive. The emergence of large language models (LLMs) such as GPT has opened a new avenue for automatic deductive coding that overcomes these limitations. To evaluate the usefulness of LLMs for automatic deductive coding, we employed three classification methods driven by different artificial intelligence technologies: traditional text classification with text feature engineering, a BERT-like pretrained language model, and a GPT-like pretrained large language model. We applied these methods to two different datasets and explored the potential of GPT and prompt engineering for automatic deductive coding. Comparing the accuracy and Kappa values of the three methods, we found that GPT with prompt engineering outperformed the other two on both datasets when only a limited number of training samples was available. By providing detailed prompt structures, this work demonstrates how large language models can be used to implement automatic deductive coding.
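The Kappa values mentioned above refer to Cohen's kappa, the standard chance-corrected agreement statistic for comparing an automatic coder's labels against human-coded ground truth. As a minimal illustration (not code from the reported study), kappa can be computed directly from two label sequences:

```python
from collections import Counter

def cohen_kappa(coder_a, coder_b):
    """Chance-corrected agreement between two sequences of category labels."""
    assert len(coder_a) == len(coder_b) and coder_a
    n = len(coder_a)
    # Observed agreement: fraction of items both coders labeled identically.
    po = sum(x == y for x, y in zip(coder_a, coder_b)) / n
    # Expected agreement under independent coding, from each coder's marginal rates.
    fa, fb = Counter(coder_a), Counter(coder_b)
    pe = sum((fa[c] / n) * (fb[c] / n) for c in set(coder_a) | set(coder_b))
    return (po - pe) / (1 - pe)

# Hypothetical example: human codes vs. model codes for four utterances.
human = ["question", "question", "statement", "statement"]
model = ["question", "statement", "statement", "statement"]
print(cohen_kappa(human, model))  # 0.5
```

Unlike raw accuracy, kappa discounts agreement that would arise by chance from skewed label distributions, which is why deductive coding studies report both.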