The growing capabilities of large language models (LLMs) present an unprecedented opportunity to scale up data analytics in the humanities and social sciences, augmenting and automating qualitative analytic tasks that were previously allocated to human labor. This contribution proposes a systematic mixed methods framework to harness qualitative analytic expertise, machine scalability, and rigorous quantification, with attention to transparency and replicability. Sixteen machine-assisted case studies are showcased as proof of concept. Tasks include linguistic and discourse analysis, lexical semantic change detection, interview analysis, historical event cause inference and text mining, detection of political stance, text and idea reuse, genre composition in literature and film, social network inference, automated lexicography, missing metadata augmentation, and multimodal visual cultural analytics. In contrast to the focus on English in the emerging LLM applicability literature, many examples here deal with scenarios involving smaller languages and historical texts prone to digitization distortions. In all but the most difficult tasks requiring expert knowledge, generative LLMs can demonstrably serve as viable research instruments. LLM (and human) annotations may contain errors and variation, but the agreement rate can and should be accounted for in subsequent statistical modeling; a bootstrapping approach is discussed. The replications among the case studies illustrate how tasks that previously required potentially months of team effort and complex computational pipelines can now be accomplished by an LLM-assisted scholar in a fraction of the time. Importantly, this approach is intended not to replace but to augment researcher knowledge and skills. With these opportunities in sight, qualitative expertise and the ability to pose insightful questions have arguably never been more critical.
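The idea of carrying annotation error into later statistics can be illustrated with a minimal sketch. This is not the procedure from the paper, whose details the abstract does not give; it is one plausible reading, assuming that a human-checked test set yields a confusion table of counts, and that each bootstrap replicate both resamples the machine-annotated items and redraws a plausible true label for each item from the matching row of that table. The function name `bootstrap_proportion`, its parameters, and the example labels are all hypothetical.

```python
import random


def bootstrap_proportion(labels, confusion, target, n_boot=1000, seed=0):
    """Estimate the share of `target` among items, with a 95% percentile interval.

    labels:    machine-assigned labels, one per item (hypothetical input)
    confusion: confusion[predicted][true] = count from a human-checked test set
               (assumed: every row has at least one nonzero count)
    """
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_boot):
        # Resample the annotated items with replacement.
        sample = rng.choices(labels, k=len(labels))
        hits = 0
        for predicted in sample:
            row = confusion[predicted]
            # Draw a plausible true label given the predicted one,
            # weighted by how often each true label occurred in testing.
            true_label = rng.choices(
                list(row.keys()), weights=list(row.values()), k=1
            )[0]
            hits += true_label == target
        estimates.append(hits / len(sample))
    estimates.sort()
    lower = estimates[int(0.025 * n_boot)]
    upper = estimates[int(0.975 * n_boot) - 1]
    mean = sum(estimates) / n_boot
    return mean, lower, upper


# Hypothetical example: 100 machine-labelled items and a small test-set table.
machine_labels = ["pos"] * 30 + ["neg"] * 70
test_confusion = {
    "pos": {"pos": 8, "neg": 2},  # items predicted "pos": 8 truly pos, 2 truly neg
    "neg": {"pos": 1, "neg": 9},  # items predicted "neg": 1 truly pos, 9 truly neg
}
mean, lower, upper = bootstrap_proportion(machine_labels, test_confusion, "pos")
print(f"share of 'pos': {mean:.3f} (95% interval {lower:.3f} to {upper:.3f})")
```

In this sketch the interval reflects both sampling variation and the measured disagreement between machine and human labels, so a downstream claim such as "stance X accounts for roughly a third of the corpus" is reported with the annotation error built in rather than ignored.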