This guide introduces Large Language Models (LLMs) as a highly versatile method for text analysis in the social sciences. Because LLMs are easy to use, cheap, fast, and applicable to a broad range of text analysis tasks, from text annotation and classification to sentiment analysis and critical discourse analysis, many scholars believe that they will transform how we do text analysis. This how-to guide is aimed at students and researchers with limited programming experience. It offers a simple introduction to using LLMs for text analysis in your own research project, along with advice on best practices. We go through each step of analyzing textual data with LLMs using Python: installing the software, setting up the API, loading the data, developing an analysis prompt, analyzing the text, and validating the results. As an illustrative example, we use the challenging task of identifying populism in political texts, and show how LLMs move beyond the existing state of the art.