Navigating Uncertainties: Understanding How GenAI Developers Document Their Models on Open-Source Platforms

Model documentation plays a crucial role in promoting transparency and responsible development of AI systems. With the rise of Generative AI (GenAI), open-source platforms have increasingly become hubs for hosting and distributing these models, prompting platforms like Hugging Face to develop dedicated model documentation guidelines that align with responsible AI principles. Despite these efforts, little is known about how developers document their GenAI models on open-source platforms. Through interviews with 13 GenAI developers active on open-source platforms, we provide empirical insights into their documentation practices and challenges. Our analysis reveals that, despite existing resources, developers of GenAI models still face multiple layers of uncertainty in their model documentation: (1) uncertainty about what specific content should be included; (2) uncertainty about how to effectively report key components of their models; and (3) uncertainty about who should take responsibility for various aspects of model documentation. Based on our findings, we discuss the implications for policymakers, open-source platforms, and the research community to support meaningful, effective, and actionable model documentation in the GenAI era, including cultivating better community norms, building robust evaluation infrastructures, and clarifying roles and responsibilities.
