Gypscie: A Cross-Platform AI Artifact Management System

Artificial Intelligence (AI) models, encompassing both traditional machine learning (ML) and more advanced approaches such as deep learning and large language models (LLMs), play a central role in modern applications. AI model lifecycle management involves the end-to-end process of managing these models, from data collection and preparation to model building, evaluation, deployment, and continuous monitoring. This process is inherently complex, as it requires the coordination of diverse services that manage AI artifacts such as datasets, dataflows, and models, all orchestrated to operate seamlessly. In this context, it is essential to isolate applications from the complexity of interacting with heterogeneous services, datasets, and AI platforms. In this paper, we introduce Gypscie, a cross-platform AI artifact management system. By providing a unified view of all AI artifacts, the Gypscie platform simplifies the development and deployment of AI applications. This unified view is realized through a knowledge graph that captures application semantics and a rule-based query language that supports reasoning over data and models. Model lifecycle activities are represented as high-level dataflows that can be scheduled across multiple platforms, such as servers, cloud platforms, or supercomputers. Finally, Gypscie records provenance information about the artifacts it produces, thereby enabling explainability. Our qualitative comparison with representative AI systems shows that Gypscie supports a broader range of functionalities across the AI artifact lifecycle. Our experimental evaluation demonstrates that Gypscie can successfully optimize and schedule dataflows on AI platforms from an abstract specification.

