Advances in Artificial Intelligence (AI) have led AI for Theorem Proving to become a promising means of formally verifying computer systems. Whilst formal verification is traditionally reserved for safety-critical systems due to the required amount of expertise and effort, AI can help to automate a large amount of this workload and make it far more accessible. Blockchain-based systems are becoming increasingly popular and are frequently targeted by malicious actors, often resulting in huge financial losses, highlighting the need to better verify these systems and mitigate vulnerabilities. Arguably the most important component of these systems is the consensus protocol, which allows nodes to agree on decisions in a potentially adversarial environment. In this paper, we improve upon IsabeLLM, the automated theorem proving tool in Isabelle. Namely, we implement a Retrieval-Augmented Generation framework, Error tracing and counterexample generation for improved context supplied to the Large Language Model. Compatibility with the latest version of Isabelle and Sledgehammer is also implemented for improved efficiency. We compare the performance of the two versions of IsabeLLM in their ability to complete the verification of Bitcoin's Proof of Work consensus.
翻译:人工智能的进步使面向定理证明的AI成为形式化验证计算机系统的有效手段。传统上,由于形式化验证所需的专业知识和投入巨大,其应用主要局限于安全关键系统,而AI能够自动化处理大量工作并显著降低使用门槛。基于区块链的系统日益普及,却频繁遭受恶意攻击,导致巨额经济损失,凸显了加强系统验证与漏洞缓解的必要性。共识协议作为此类系统最关键的组成部分,使节点能在潜在对抗环境中达成一致决策。本文对IsabeLLM——基于Isabelle的自动化定理证明工具进行了改进:具体实现了检索增强生成框架、错误追踪与反例生成机制,以提升提供给大语言模型的上下文质量;同时适配最新版Isabelle和Sledgehammer以提高效率。我们通过比较两个版本IsabeLLM完成比特币工作量证明共识验证的能力,评估了其性能差异。