You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models

RESTful APIs are popular web services that require documentation to ease their comprehension, reuse, and testing. The OpenAPI Specification (OAS) is a widely adopted, machine-readable format for documenting such APIs. However, manually documenting RESTful APIs is time-consuming and error-prone, resulting in unavailable, incomplete, or imprecise documentation. As RESTful API testing tools require an OpenAPI specification as input, insufficient or informal documentation hampers testing quality. Recently, Large Language Models (LLMs) have demonstrated exceptional abilities to automate tasks based on their colossal training data. Such capabilities could accordingly be used to assist the documentation and testing of RESTful APIs. In this paper, we present RESTSpecIT, the first automated RESTful API specification inference and black-box testing approach leveraging LLMs. The approach requires minimal user input compared to state-of-the-art RESTful API inference and testing tools: given an API name and an LLM key, it generates HTTP requests and mutates them with data returned by the LLM. The requests are sent to the API endpoint, and the resulting HTTP responses are analyzed for inference and testing purposes. RESTSpecIT uses an in-context prompt masking strategy and requires no model fine-tuning. Our evaluation demonstrates that RESTSpecIT is capable of: (1) inferring specifications with 85.05% of GET routes and 81.05% of query parameters found on average, (2) discovering undocumented and valid routes and parameters, and (3) uncovering server errors in RESTful APIs. Inferred specifications can also be used as testing tool inputs.
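The request generation and mutation loop described above can be illustrated with a minimal sketch. It is not the RESTSpecIT implementation: the mask token, the prompt wording, the helper names (`build_prompt`, `mutate`, `classify`, `infer_candidates`), the stubbed `fake_llm`, and the `api.example.com` URL are all assumptions made for illustration, and no real LLM or network call is performed.

```python
from typing import Callable, List

MASK = "<mask>"  # hypothetical mask token; the actual token is not given in the abstract


def build_prompt(api_name: str, masked_request: str) -> str:
    """Build an in-context prompt asking the LLM to fill the masked slot."""
    return (
        f"API: {api_name}\n"
        f"Request: {masked_request}\n"
        f"Suggest values for {MASK}, one per line."
    )


def mutate(masked_request: str, candidates: List[str]) -> List[str]:
    """Substitute each candidate value for the mask, skipping blanks and duplicates."""
    requests: List[str] = []
    for value in candidates:
        value = value.strip()
        if not value:
            continue
        request = masked_request.replace(MASK, value)
        if request not in requests:
            requests.append(request)
    return requests


def classify(status: int) -> str:
    """Map an HTTP status code to a coarse outcome for inference and testing."""
    if 200 <= status < 300:
        return "valid"          # candidate for the inferred specification
    if 500 <= status < 600:
        return "server_error"   # potential fault uncovered by black-box testing
    return "invalid"


def fake_llm(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: the reply lists one value per line)."""
    return "users\nposts\nusers\n"


def infer_candidates(
    api_name: str, masked_request: str, llm: Callable[[str], str]
) -> List[str]:
    """Ask the LLM for mask values and return the concrete requests to send."""
    reply = llm(build_prompt(api_name, masked_request))
    return mutate(masked_request, reply.splitlines())


requests_to_try = infer_candidates(
    "Example API", f"GET https://api.example.com/{MASK}", fake_llm
)
```

In a real run, each entry in `requests_to_try` would be sent to the API, and `classify` would decide whether the route belongs in the inferred specification or points to a server error.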
