Temporal logics are powerful tools that are widely used for the synthesis and verification of reactive systems. The recent progress on Large Language Models (LLMs) has the potential to make the process of writing such specifications more accessible. However, writing specifications in temporal logics remains challenging for all but the most expert users. A key question in using LLMs for temporal logic specification engineering is to understand what kind of guidance is most helpful to the LLM and the users to easily produce specifications. Looking specifically at the problem of reactive program synthesis, we explore the impact of providing an LLM with guidance on the separation of control and data--making explicit for the LLM what functionality is relevant for the specification, and treating the remaining functionality as an implementation detail for a series of pre-defined functions and predicates. We present a benchmark set and find that this separation of concerns improves specification generation. Our benchmark provides a test set against which to verify future work in LLM generation of temporal logic specifications.
翻译:时序逻辑是广泛应用于反应式系统综合与验证的强大工具。大语言模型的最新进展有望使编写此类规约的过程更易于实现。然而,除最专业的用户外,使用时序逻辑编写规约仍具挑战性。在将大语言模型应用于时序逻辑规约工程时,核心问题在于理解何种引导方式最有助于大语言模型和用户轻松生成规约。针对反应式程序综合这一具体问题,我们探索了通过控制与数据分离的引导对大语言模型产生的影响——向大语言模型明确指明规约相关的功能范畴,而将剩余功能视为预定义函数和谓词序列的实现细节。我们构建了一套基准测试集,发现这种关注点分离能有效提升规约生成质量。该基准为验证未来大语言模型时序逻辑生成研究提供了测试标准。