Writing declarative models has numerous benefits, ranging from automated reasoning and correction of design-level properties before systems are built, to automated testing and debugging of their implementations after they are built. Alloy is a declarative modeling language that is well-suited for verifying system designs. A key strength of Alloy is its scenario-finding toolset, the Analyzer, which allows users to explore all valid scenarios that adhere to the model's constraints up to a user-provided scope. However, even with visualized scenarios, it is difficult to write correct Alloy models. To address this, a growing body of work explores different techniques for debugging Alloy models. In order to develop and evaluate these techniques in an effective manor, this paper presents an empirical study of over 97,000 models written by novice users trying to learn Alloy. We investigate how users write both correct and incorrect models in order to produce a comprehensive benchmark for future use as well as a series of observations to guide debugging and educational efforts for Alloy model development.
翻译:编写声明式模型具有诸多益处,从在系统构建前对设计层面属性进行自动化推理与修正,到构建后对其实现进行自动化测试与调试。Alloy是一种适用于验证系统设计的声明式建模语言,其核心优势在于场景查找工具集——分析器(Analyzer),该工具允许用户在用户指定的范围内探索所有符合模型约束的有效场景。然而,即便借助可视化场景,编写正确的Alloy模型仍颇具挑战。为此,学界不断探索调试Alloy模型的不同技术。为有效开发与评估这些技术,本文对97,000余个由初学者编写的Alloy模型开展了实证研究。我们探究了用户编写正确与错误模型的方式,旨在构建一个可供未来使用的综合性基准测试集,并形成一系列观察结论以指导Alloy模型开发的调试与教育工作。