Artificial Intelligence (AI), particularly large-scale generative AI (GenAI) models such as Large Language Models (LLMs), has become a transformative element of contemporary technology. While these models have unlocked new possibilities, they also present significant challenges, including data privacy concerns and a propensity to generate misleading or fabricated content. Current Responsible AI (RAI) frameworks often lack the granular guidance needed for practical application, especially for Accountability, a principle that is pivotal for ensuring transparent and auditable decision-making, bolstering public trust, and meeting growing regulatory expectations. This study addresses the accountability gap by introducing a comprehensive metrics catalogue, developed through a systematic multivocal literature review (MLR) that integrates findings from both academic and grey literature. The catalogue comprises three categories: process metrics that underpin procedural integrity, resource metrics that provide necessary tools and frameworks, and product metrics that reflect the outputs of AI systems. This tripartite structure is designed to operationalize Accountability in AI, with particular emphasis on the intricacies of GenAI. The catalogue offers organizations practical, actionable guidance for building Accountability into AI systems, thereby supporting responsible practices in the field.