The field of software verification has produced a wide array of algorithmic techniques that can prove a variety of properties of a given program. It has been demonstrated that the performance of these techniques can vary up to 4 orders of magnitude on the same verification problem. Even for verification experts, it is difficult to decide which tool will perform best on a given problem. For general users, deciding the best tool for their verification problem is effectively impossible. In this work, we present Graves, a selection strategy based on graph neural networks (GNNs). Graves generates a graph representation of a program from which a GNN predicts a score for a verifier that indicates its performance on the program. We evaluate Graves on a set of 10 verification tools and over 8000 verification problems and find that it improves the state-of-the-art in verification algorithm selection by 12%, or 8 percentage points. Further, it is able to verify 9% more problems than any existing verifier on our test set. Through a qualitative study on model interpretability, we find strong evidence that the Graves' model learns to base its predictions on factors that relate to the unique features of the algorithmic techniques.
翻译:软件验证领域已产生了多种算法技术,能够证明给定程序的各种属性。研究表明,这些技术在同一验证问题上的性能差异可达四个数量级。即使是验证专家,也难以判断哪种工具在特定问题上表现最佳。对普通用户而言,为其验证问题选择最佳工具几乎不可能。本文提出了Graves——一种基于图神经网络(GNN)的选择策略。Graves生成程序的图表示,GNN据此预测验证器在程序上的性能得分。我们在10种验证工具和8000多个验证问题上评估了Graves,发现它将验证算法选择的最新技术水平提升了12%(即8个百分点)。此外,该策略在我们的测试集上比任何现有验证器能多验证9%的问题。通过模型可解释性的定性研究,我们获得了有力证据,表明Graves模型学会基于与算法技术独特特征相关的因素进行预测。