A Scoping Study of Evaluation Practices for Responsible AI Tools: Steps Towards Effectiveness Evaluations

From arXiv. Accepted for publication in the Proceedings of the CHI Conference on Human Factors in Computing Systems (CHI '24), May 11-16, 2024, Honolulu, HI, USA.

Responsible design of AI systems is a shared goal across HCI and AI communities. Responsible AI (RAI) tools have been developed to help practitioners identify, assess, and mitigate ethical issues during AI development. These tools take many forms (e.g., design playbooks, software toolkits, documentation protocols). However, research suggests that use of RAI tools is shaped by organizational contexts, raising questions about how effective such tools are in practice. To better understand how RAI tools are, and might be, evaluated, we conducted a qualitative analysis of 37 publications that discuss evaluations of RAI tools. We find that most evaluations focus on usability, while questions of tools' effectiveness in changing AI development are sidelined. While usability evaluations are an important approach to evaluating RAI tools, we draw on evaluation approaches from other fields to highlight developer- and community-level steps to support evaluations of RAI tools' effectiveness in shaping AI development practices and outcomes.

