The effects of generative AI are experienced by a broad range of constituencies, but the disciplinary inputs to its development have been surprisingly narrow. Here we present a set of provocations from humanities researchers -- currently underrepresented in AI development -- intended to inform its future applications and enrich ongoing conversations about its uses, impact, and harms. Drawing from relevant humanities scholarship, along with foundational work in critical data studies, we elaborate eight claims with broad applicability to generative AI research: 1) Models make words, but people make meaning; 2) Generative AI requires an expanded definition of culture; 3) Generative AI can never be representative; 4) Bigger models are not always better models; 5) Not all training data is equivalent; 6) Openness is not an easy fix; 7) Limited access to compute enables corporate capture; and 8) AI universalism creates narrow human subjects. We also provide a working definition of humanities research, summarize some of its most salient theories and methods, and apply these theories and methods to the current landscape of AI. We conclude with a discussion of the importance of resisting the extraction of humanities research by computer science and related fields.