Large Language Model Supply Chain: A Research Agenda

The rapid advancements in pre-trained Large Language Models (LLMs) and Large Multimodal Models (LMMs) have ushered in a new era of intelligent applications, transforming fields ranging from natural language processing to content generation. The LLM supply chain represents a crucial aspect of the contemporary artificial intelligence landscape. It encompasses the entire lifecycle of pre-trained models, from its initial development and training to its final deployment and application in various domains. This paper presents a comprehensive overview of the LLM supply chain, highlighting its three core elements: 1) the model infrastructure, encompassing datasets and toolchain for training, optimization, and deployment; 2) the model lifecycle, covering training, testing, releasing, and ongoing maintenance; and 3) the downstream application ecosystem, enabling the integration of pre-trained models into a wide range of intelligent applications. However, this rapidly evolving field faces numerous challenges across these key components, including data privacy and security, model interpretability and fairness, infrastructure scalability, and regulatory compliance. Addressing these challenges is essential for harnessing the full potential of LLMs and ensuring their ethical and responsible use. This paper provides a future research agenda for the LLM supply chain, aiming at driving the continued advancement and responsible deployment of these transformative LLMs.

翻译：预训练大型语言模型（LLMs）与大型多模态模型（LMMs）的快速发展，开启了智能应用的新纪元，推动了从自然语言处理到内容生成等多个领域的变革。LLM供应链作为当代人工智能生态系统中的关键环节，涵盖了预训练模型从初始开发训练到最终部署应用的全生命周期。本文对LLM供应链进行了全面综述，重点阐述其三大核心要素：1）模型基础设施，涵盖用于训练、优化与部署的数据集及工具链；2）模型生命周期，包括训练、测试、发布及持续维护；3）下游应用生态系统，支持将预训练模型集成至各类智能应用中。然而，这一快速发展的领域在数据隐私与安全、模型可解释性与公平性、基础设施可扩展性及合规监管等关键环节仍面临诸多挑战。解决这些问题对于充分发挥LLMs的潜力、确保其伦理与负责任使用至关重要。本文提出了LLM供应链的未来研究议程，旨在推动这些变革性大模型的持续进步与负责任部署。

相关内容

MoDELS

关注 45

ACM/IEEE第23届模型驱动工程语言和系统国际会议，是模型驱动软件和系统工程的首要会议系列，由ACM-SIGSOFT和IEEE-TCSE支持组织。自1998年以来，模型涵盖了建模的各个方面，从语言和方法到工具和应用程序。模特的参加者来自不同的背景，包括研究人员、学者、工程师和工业专业人士。MODELS 2019是一个论坛，参与者可以围绕建模和模型驱动的软件和系统交流前沿研究成果和创新实践经验。今年的版本将为建模社区提供进一步推进建模基础的机会，并在网络物理系统、嵌入式系统、社会技术系统、云计算、大数据、机器学习、安全、开源等新兴领域提出建模的创新应用以及可持续性。官网链接：http://www.modelsconference.org/

【CVPR 2022】基于元内存传输的跨域少镜头语义分割，Remember the Difference: Cross-Domain Few-Shot Semantic Segmentation via Meta-Memory Transfer

专知会员服务

13+阅读 · 2022年3月12日

【CVPR 2022】一种无需使用负样本的自监督学习方法，Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes

专知会员服务

15+阅读 · 2022年3月12日

FlowQA: Grasping Flow in History for Conversational Machine Comprehension

专知会员服务

34+阅读 · 2019年10月18日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日