A rigorous formalization of desired system requirements is indispensable when performing any verification task. This often limits the application of verification techniques, as writing formal specifications is an error-prone and time-consuming manual task. To facilitate this, we present nl2spec, a framework for applying Large Language Models (LLMs) to derive formal specifications (in temporal logics) from unstructured natural language. In particular, we introduce a new methodology to detect and resolve the inherent ambiguity of system requirements in natural language: we utilize LLMs to map subformulas of the formalization back to the corresponding natural language fragments of the input. Users iteratively add, delete, and edit these sub-translations to amend erroneous formalizations, which is easier than manually redrafting the entire formalization. The framework is agnostic to specific application domains and can be extended to similar specification languages and new neural models. We perform a user study to obtain a challenging dataset, which we use to run experiments on the quality of translations. We provide an open-source implementation, including a web-based frontend.
翻译:对系统需求进行严格的形式化是执行任何验证任务时不可或缺的步骤。然而,编写形式化规约通常是一个易出错且耗时的手工任务,这往往限制了验证技术的应用。为此,我们提出nl2spec框架,该框架利用大语言模型从非结构化自然语言中推导出形式化规约(以时序逻辑表示)。特别地,我们引入了一种新方法来检测并解决自然语言中系统需求的固有歧义:利用大语言模型将形式化规约的子公式映射回输入中对应的自然语言片段。用户可通过迭代地添加、删除和编辑这些子翻译来修正错误的形式化结果,这比手动重写整个形式化规约更加简便。该框架与特定应用领域无关,并可扩展至类似的规约语言和新型神经模型。我们通过用户研究获取了一个具有挑战性的数据集,并基于该数据集开展了翻译质量实验。此外,我们提供了包含网页前端在内的开源实现。