Recent advances in large language models (LLMs) such as GPT-3 and GPT-4 have opened up new opportunities for text analysis in political science. These models promise automation with better results and less programming effort. In this study, we evaluate LLMs on three original coding tasks involving non-English political science texts, and we describe in detail a general workflow for using LLMs for text coding in political science research. Our use case offers a practical guide for researchers looking to incorporate LLMs into their text analysis research. We find that, when provided with detailed label definitions and coding examples, an LLM can perform as well as or even better than a human annotator while being much faster (up to hundreds of times faster), considerably cheaper (costing up to 60% less than human coding), and much easier to scale to large amounts of text. Overall, LLMs present a viable option for most text coding projects.