The premise of this article is that a basic understanding of the composition and functioning of large language models is urgently needed. To that end, we extract a representational map of OpenAI's GPT-2 in terms of what we articulate as two classes of deep learning code: that which pertains to the model and that which underwrites applications built around the model. We then verify this map through case studies of two popular GPT-2 applications: the text adventure game AI Dungeon and the language art project This Word Does Not Exist. Such an exercise allows us to test the potential of Critical Code Studies when the object of study is deep learning code and to demonstrate the validity of code as an analytical focus for researchers in the subfields of Critical Artificial Intelligence and Critical Machine Learning Studies. More broadly, however, our work draws attention to the means by which ordinary users might interact with, and even direct, the behavior of deep learning systems, and by extension works toward dispelling some of the auratic mystery of "AI." What is at stake is the possibility of achieving an informed sociotechnical consensus about the responsible applications of large language models, as well as a more expansive sense of their creative capabilities. Indeed, understanding how and where engagement occurs allows all of us to become more active participants in the development of machine learning systems.