Evaluating the Social Impact of Generative AI Systems in Systems and Society

Irene Solaiman, Zeerak Talat, William Agnew, Lama Ahmad, Dylan Baker, Su Lin Blodgett, Hal Daumé III, Jesse Dodge, Ellie Evans, Sara Hooker, Yacine Jernite, Alexandra Sasha Luccioni, Alberto Lusoli, Margaret Mitchell, Jessica Newman, Marie-Therese Png, Andrew Strait, Apostol Vassilev

Generative AI systems across modalities, including text, image, audio, and video, have broad social impacts, but there is no official standard for how to evaluate those impacts or for which impacts should be evaluated. We move toward a standard approach to evaluating a generative AI system of any modality, in two overarching categories: what can be evaluated in a base system that has no predetermined application, and what can be evaluated in society. We describe specific social impact categories and how to approach and conduct evaluations, first in the base technical system and then in people and society. Our framework for a base system defines seven categories of social impact: bias, stereotypes, and representational harms; cultural values and sensitive content; disparate performance; privacy and data protection; financial costs; environmental costs; and data and content moderation labor costs. The suggested evaluation methods apply to all modalities, and analyses of the limitations of existing evaluations serve as a starting point for necessary investment in future evaluations. We offer five overarching categories for what can be evaluated in society, each with its own subcategories: trustworthiness and autonomy; inequality, marginalization, and violence; concentration of authority; labor and creativity; and ecosystem and environment. Each subcategory includes recommendations for mitigating harm. We are concurrently building an evaluation repository to which the AI research community can contribute existing evaluations along the given categories. This version will be updated following a CRAFT session at ACM FAccT 2023.
