Documenting Ethical Considerations in Open Source AI Models

Background: The development of AI-enabled software heavily depends on AI model documentation, such as model cards, due to different domain expertise between software engineers and model developers. From an ethical standpoint, AI model documentation conveys critical information on ethical considerations along with mitigation strategies for downstream developers to ensure the delivery of ethically compliant software. However, knowledge on such documentation practice remains scarce. Aims: The objective of our study is to investigate how developers document ethical aspects of open source AI models in practice, aiming at providing recommendations for future documentation endeavours. Method: We selected three sources of documentation on GitHub and Hugging Face, and developed a keyword set to identify ethics-related documents systematically. After filtering an initial set of 2,347 documents, we identified 265 relevant ones and performed thematic analysis to derive the themes of ethical considerations. Results: Six themes emerge, with the three largest ones being model behavioural risks, model use cases, and model risk mitigation. Conclusions: Our findings reveal that open source AI model documentation focuses on articulating ethical problem statements and use case restrictions. We further provide suggestions to various stakeholders for improving documentation practice regarding ethical considerations.

翻译：背景：由于软件工程师与模型开发者之间的领域专业知识差异，人工智能赋能软件的开发高度依赖于模型卡片等人工智能模型文档。从伦理角度来看，人工智能模型文档向下游开发者传递了关于伦理考量的关键信息及缓解策略，以确保交付符合伦理规范的软件。然而，关于此类文档实践的知识仍然匮乏。目的：本研究旨在探究开发者在实践中如何记录开源人工智能模型的伦理层面，以期对未来文档工作提供建议。方法：我们选取了GitHub和Hugging Face上的三类文档来源，并开发了一套关键词集以系统识别伦理相关文档。在筛选了初始的2,347份文档后，我们确定了265份相关文档，并通过主题分析归纳出伦理考量的主题类别。结果：研究共浮现出六大主题，其中规模最大的三个主题分别是模型行为风险、模型使用场景和模型风险缓解。结论：我们的发现表明，开源人工智能模型文档侧重于阐明伦理问题陈述和使用场景限制。我们进一步为各利益相关方提供了改进伦理考量文档实践的建议。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【CVPR 2022】一个完全无监督的框架，从噪声和部分测量中学习图像，Robust Equivariant Imaging: a fully unsupervised framework for learning to image

专知会员服务

25+阅读 · 2022年3月3日

Linux导论，Introduction to Linux，96页ppt

专知会员服务

82+阅读 · 2020年7月26日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日