Artificial Intelligence (AI), particularly through the advent of large-scale generative AI (GenAI) models such as Large Language Models (LLMs), has become a transformative element in contemporary technology. While these models have unlocked new possibilities, they also present significant challenges, such as data privacy concerns and a propensity to generate misleading or fabricated content. Current frameworks for Responsible AI (RAI) often fall short of providing the granular guidance needed for practical application, especially for Accountability, a principle that is pivotal for ensuring transparent and auditable decision-making, bolstering public trust, and meeting increasing regulatory expectations. This study addresses that accountability gap by presenting our work towards a comprehensive metrics catalogue, formulated through a systematic multivocal literature review (MLR) that integrates findings from both academic and grey literature. Our catalogue delineates process metrics that underpin procedural integrity, resource metrics that provide necessary tools and frameworks, and product metrics that reflect the outputs of AI systems. This tripartite framework is designed to operationalize Accountability in AI, with a special emphasis on addressing the intricacies of GenAI.