In Artificial Intelligence (AI), language models have gained significant importance due to the widespread adoption of systems capable of simulating realistic conversations with humans through text generation. Because of their impact on society, these language models must be developed and deployed responsibly, with attention to their negative impacts and possible harms. In this scenario, the number of publications on AI Ethics Tools (AIETs) has recently increased. These AIETs are designed to help developers, companies, governments, and other stakeholders establish trust, transparency, and responsibility in their technologies by bringing accepted values to guide the design, development, and use of AI. However, many AIETs lack good documentation, examples of use, and proof of their effectiveness in practice. This paper presents a methodology for evaluating AIETs applied to language models. Our approach involved an extensive literature survey of 213 AIETs, from which, after applying inclusion and exclusion criteria, we selected four: Model Cards, ALTAI, FactSheets, and Harms Modeling. For evaluation, we applied these AIETs to language models developed for the Portuguese language and conducted 35 hours of interviews with their developers. The evaluation considered the developers' perspective on the AIETs' use and quality in helping to identify ethical considerations about their models. The results suggest that the applied AIETs serve as a guide for formulating general ethical considerations about language models. However, they do not address unique aspects of these models, such as idiomatic expressions. Additionally, these AIETs did not help to identify potential negative impacts of models for the Portuguese language.