Artificial Intelligence (AI) models, encompassing both traditional machine learning (ML) and more advanced approaches such as deep learning and large language models (LLMs), play a central role in modern applications. AI model lifecycle management covers the end-to-end handling of these models, from data collection and preparation to model building, evaluation, deployment, and continuous monitoring. This process is inherently complex because it requires coordinating diverse services that manage AI artifacts such as datasets, dataflows, and models, and orchestrating them to operate together seamlessly. It is therefore essential to isolate applications from the complexity of interacting with heterogeneous services, datasets, and AI platforms. In this paper, we introduce Gypscie, a cross-platform AI artifact management system. By providing a unified view of all AI artifacts, Gypscie simplifies the development and deployment of AI applications. This unified view is realized through a knowledge graph that captures application semantics and a rule-based query language that supports reasoning over data and models. Model lifecycle activities are represented as high-level dataflows that can be scheduled across multiple platforms, such as servers, cloud platforms, or supercomputers. Finally, Gypscie records provenance information about the artifacts it produces, thereby enabling explainability. Our qualitative comparison with representative AI systems shows that Gypscie supports a broader range of functionalities across the AI artifact lifecycle. Our experimental evaluation demonstrates that Gypscie can successfully optimize and schedule dataflows on AI platforms from an abstract specification.