Specifications are vital for ensuring program correctness, yet writing them manually remains challenging and time-intensive. Recent large language model (LLM)-based methods have shown success in generating specifications such as postconditions, but existing single-pass prompting often yields inaccurate results. In this paper, we present SpecMind, a novel framework for postcondition generation that treats LLMs as interactive, exploratory reasoners rather than one-shot generators. SpecMind employs a feedback-driven multi-turn prompting approach that lets the model iteratively refine candidate postconditions by incorporating both implicit and explicit correctness feedback, while autonomously deciding when to stop. Through these exploratory attempts, the model develops a deeper understanding of the code and produces postconditions that align more closely with true program behavior. Our empirical evaluation shows that SpecMind significantly outperforms state-of-the-art approaches in both the accuracy and the completeness of generated postconditions.
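To make the refinement loop concrete, the following is a minimal sketch of one plausible feedback-driven multi-turn cycle. It assumes test execution as the source of explicit correctness feedback; the `llm` callable, the `holds` evaluator, and the use of test consistency as a stopping proxy are illustrative assumptions, not SpecMind's actual interface (in the paper, the model itself decides autonomously when to stop).

```python
def generate_postcondition(llm, code: str, tests, max_turns: int = 5) -> str:
    """Sketch of a feedback-driven multi-turn refinement loop.

    `llm` is any callable mapping a prompt string to a text response;
    `tests` is a list of (inputs_dict, output) pairs used to derive
    explicit correctness feedback. All names here are hypothetical.
    """
    prompt = (
        "Write a postcondition, as a Python boolean expression over the "
        f"function's inputs and `result`, for this function:\n{code}"
    )
    candidate = llm(prompt)
    for _ in range(max_turns):
        # Check the candidate against observed executions.
        failures = [(inp, out) for inp, out in tests
                    if not holds(candidate, inp, out)]
        if not failures:
            break  # consistent with all tests: stop refining (a proxy
                   # for the paper's model-driven stopping decision)
        # Feed counterexamples back so the model can refine its candidate.
        prompt = (
            f"The postcondition `{candidate}` is violated on the test "
            f"executions {failures[:3]}. Propose a corrected postcondition."
        )
        candidate = llm(prompt)
    return candidate


def holds(postcondition: str, inputs: dict, result) -> bool:
    """Evaluate a candidate postcondition on one test execution."""
    try:
        return bool(eval(postcondition, {}, {**inputs, "result": result}))
    except Exception:
        return False  # an un-evaluable candidate counts as a failure
```

Under these assumptions, each turn either terminates (no counterexamples remain) or returns concrete failing executions to the model, which is one simple way to realize the "explicit correctness feedback" the abstract describes.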