Specifications: The missing link to making the development of LLM systems an engineering discipline

Ion Stoica, Matei Zaharia, Joseph Gonzalez, Ken Goldberg, Koushik Sen, Hao Zhang, Anastasios Angelopoulos, Shishir G. Patil, Lingjiao Chen, Wei-Lin Chiang, Jared Q. Davis

Despite the significant strides made by generative AI in just a few short years, its future progress is constrained by the challenge of building modular and robust systems. This capability has been a cornerstone of past technological revolutions, which relied on combining components to create increasingly sophisticated and reliable systems. Cars, airplanes, computers, and software consist of components (such as engines, wheels, CPUs, and libraries) that can be assembled, debugged, and replaced. A key tool for building such reliable and modular systems is specification: the precise description of the expected behavior, inputs, and outputs of each component. However, the generality of LLMs and the inherent ambiguity of natural language make defining specifications for LLM-based components (e.g., agents) both a challenging and urgent problem. In this paper, we discuss the progress the field has made so far, through advances like structured outputs, process supervision, and test-time compute, and outline several future directions for research to enable the development of modular and reliable LLM-based systems through improved specifications.
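To make the notion of a specification concrete, the following is a minimal sketch of what a precise input and output description for a single LLM-based component might look like when combined with a structured-output check. It is an illustration only, not the method proposed in the paper: the sentiment task, the `Sentiment` type, `ALLOWED_LABELS`, `parse_sentiment`, `classify`, and `stub_model` are all hypothetical names, and the model is a hard-coded stand-in rather than a real LLM call.

```python
import json
from dataclasses import dataclass
from typing import Callable

# Hypothetical output specification: the only labels the component may return.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


@dataclass(frozen=True)
class Sentiment:
    """Structured output the (hypothetical) component promises to return."""
    label: str         # must be one of ALLOWED_LABELS
    confidence: float  # must lie in [0.0, 1.0]


def parse_sentiment(raw: str) -> Sentiment:
    """Check a raw model reply against the output specification."""
    data = json.loads(raw)  # non-JSON text raises json.JSONDecodeError, a ValueError
    if not isinstance(data, dict) or set(data) != {"label", "confidence"}:
        raise ValueError("expected exactly the keys 'label' and 'confidence'")
    label, confidence = data["label"], data["confidence"]
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        raise ValueError(f"unknown label: {label!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return Sentiment(label, float(confidence))


def classify(text: str, model: Callable[[str], str]) -> Sentiment:
    """Component wrapper: enforce the input spec, call the model, enforce the output spec."""
    if not text.strip():
        raise ValueError("input text must be non-empty")
    return parse_sentiment(model(text))


def stub_model(prompt: str) -> str:
    """Stand-in for an LLM call (assumption: a real model would be prompted to emit this JSON)."""
    return '{"label": "positive", "confidence": 0.9}'


if __name__ == "__main__":
    print(classify("great product", stub_model))
```

Because the wrapper either returns a well-typed `Sentiment` or raises `ValueError`, a downstream component can depend on the output shape alone. That is the kind of modularity the abstract attributes to specifications, although a check like this covers only the format of the output and not whether its content is correct.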

