Research and industry are rapidly developing and adopting foundation-model-based systems, yet the tools for managing these models have not kept pace. Understanding the provenance and lineage of models is critical for researchers, industry, regulators, and public trust. Model cards and system cards were designed to provide transparency, but they fall short in four areas: tracing model genealogy, enabling machine readability, offering reliable centralized management, and giving authors consistent incentives to create them. This challenge mirrors issues in software supply chain security, although AI/ML is at an earlier stage of maturity. Addressing these gaps requires industry-standard tooling that foundation model publishers, open-source model innovators, and major distribution platforms can all adopt. We propose a machine-readable model specification format that simplifies the creation of model records and reduces error-prone manual effort, notably when a new model inherits most of its design from a foundation model. The format explicitly records relationships between upstream and downstream models, improving transparency and traceability across the model lifecycle. To facilitate adoption, we introduce the unified model record (UMR) repository, a semantically versioned system that automates the publication of model records in multiple formats (PDF, HTML, LaTeX) and provides a hosted web interface (https://modelrecord.com/). This proof of concept aims to set a new standard for managing foundation models, bridging the gap between innovation and responsible model management.
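To make the idea of a machine-readable record with explicit upstream links more concrete, the following is a minimal Python sketch. It is not the actual UMR schema: the `ModelRecord` class, its field names (`upstream`, `fields`), the helper functions, and the example model names are all hypothetical and chosen only for illustration. It shows how a downstream record could declare only what differs from its parent and inherit the rest.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class ModelRecord:
    """Hypothetical record layout; not the real UMR format."""
    name: str
    version: str                      # semantic version string, e.g. "1.0.0"
    upstream: Optional[str] = None    # name of the parent record, if any
    fields: Dict[str, Any] = field(default_factory=dict)


def lineage(name: str, registry: Dict[str, ModelRecord]) -> List[str]:
    """Return the chain of record names from `name` up to its root ancestor."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = name
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle in upstream links at {current!r}")
        seen.add(current)
        chain.append(current)
        current = registry[current].upstream
    return chain


def resolve(name: str, registry: Dict[str, ModelRecord]) -> Dict[str, Any]:
    """Merge fields from the root ancestor down, so children override parents."""
    merged: Dict[str, Any] = {}
    for ancestor in reversed(lineage(name, registry)):
        merged.update(registry[ancestor].fields)
    return merged


# Illustrative data only: both records and all values are made up.
registry = {
    "base-fm": ModelRecord(
        name="base-fm",
        version="1.0.0",
        fields={"architecture": "transformer", "license": "apache-2.0"},
    ),
    "chat-ft": ModelRecord(
        name="chat-ft",
        version="0.1.0",
        upstream="base-fm",
        fields={"training_data": "hypothetical-dialogue-set"},
    ),
}

if __name__ == "__main__":
    print(lineage("chat-ft", registry))
    print(resolve("chat-ft", registry))
```

In this sketch the downstream record `chat-ft` states only its own training data, while its architecture and license are filled in from `base-fm`. This is the kind of reduction in manual effort the abstract describes, under the assumption that inheritance is resolved by a simple parent-to-child merge.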