Large Language Models (LLMs) increasingly serve as consumers of API specifications, whether for code generation, autonomous agent interaction, or API-assisted reasoning. The de facto standard for API description, OpenAPI, was designed for documentation tools and code generators, resulting in substantial token overhead when used as LLM context. We present LAPIS (Lightweight API Specification for Intelligent Systems), a domain-specific format optimized for LLM consumption that preserves the semantic information necessary for API reasoning while minimizing token usage. Through empirical evaluation against five real-world production API specifications (GitHub with 1,080 endpoints, Twilio with 197, DigitalOcean with 545, Petstore, and HTTPBin), we demonstrate an average token reduction of 85.5% compared to OpenAPI YAML and 88.6% compared to OpenAPI JSON, measured with the cl100k_base tokenizer. LAPIS introduces domain-specific structural innovations, including centralized error definitions, webhook trigger conditions, structured rate limit descriptions, and operation flow declarations: information that OpenAPI either represents redundantly or cannot represent at all. The format is fully convertible from OpenAPI 3.x via an automated converter, requires no special parser for LLM consumption, and is released as an open specification under CC BY 4.0.