The rapid adoption of smart contracts calls for rigorous auditing to ensure their security and reliability. Manual auditing, although comprehensive, is time-consuming and depends heavily on the auditor's expertise. With the rise of Large Language Models (LLMs), there is growing interest in using them to assist auditors in the auditing process (co-auditing). However, the effectiveness of LLMs in smart contract co-auditing depends on the design of the input prompts, especially the context description and the code length. This paper introduces a novel context-driven prompting technique for smart contract co-auditing. Our approach employs three techniques for context scoping and augmentation: code scoping, which chunks long code into self-contained segments based on code inter-dependencies; assessment scoping, which enriches the context description with the target assessment goal and thereby limits the search space; and reporting scoping, which enforces a specific format for the generated response. In empirical evaluations on publicly available vulnerable contracts, our method achieved a detection rate of 96\% for vulnerable functions, outperforming the native prompting approach, which detected only 53\%. To assess the reliability of our prompting approach, expert auditors from our partner, Quantstamp, a world-leading smart contract auditing company, manually analyzed the results. Their analysis indicates that, on unlabeled datasets, our proposed approach enhances the proficiency of the GPT-4 code interpreter in detecting vulnerabilities.
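To make the three scoping steps concrete, the following is a minimal Python sketch of how such a prompt might be assembled. It is not the authors' implementation: the function names (`scope_code`, `build_prompt`), the `REPORT_FORMAT` text, the call-graph representation, and the toy contract are all hypothetical and chosen only for illustration. Code scoping is approximated here by collecting a target function together with the functions it transitively calls.

```python
from collections import deque

# Hypothetical report template (reporting scoping): forces a fixed answer layout.
REPORT_FORMAT = (
    "Respond with one line per finding, formatted as:\n"
    "<function name> | <vulnerability type> | <one-sentence justification>"
)


def scope_code(target, sources, calls):
    """Code scoping (assumed form): return the target function's source
    followed by the sources of everything it transitively calls."""
    seen, order, queue = {target}, [target], deque([target])
    while queue:
        current = queue.popleft()
        # Sort callees so the resulting segment is deterministic.
        for callee in sorted(calls.get(current, ())):
            if callee in sources and callee not in seen:
                seen.add(callee)
                order.append(callee)
                queue.append(callee)
    return "\n\n".join(sources[name] for name in order)


def build_prompt(target, sources, calls, goal):
    """Combine the scoped code segment, the assessment goal
    (assessment scoping), and the report template into one prompt."""
    segment = scope_code(target, sources, calls)
    return (
        f"Assessment goal: {goal}\n"
        "Only report issues relevant to this goal.\n\n"
        f"Code segment (function `{target}` and its dependencies):\n"
        f"{segment}\n\n"
        f"{REPORT_FORMAT}"
    )


if __name__ == "__main__":
    # Toy contract fragments; the call graph is written by hand here,
    # whereas a real pipeline would derive it from the contract source.
    sources = {
        "withdraw": "function withdraw(uint amount) public { "
                    "_send(msg.sender, amount); balances[msg.sender] -= amount; }",
        "_send": "function _send(address to, uint amount) internal { "
                 "(bool ok, ) = to.call{value: amount}(\"\"); require(ok); }",
        "deposit": "function deposit() public payable { "
                   "balances[msg.sender] += msg.value; }",
    }
    calls = {"withdraw": {"_send"}}
    print(build_prompt("withdraw", sources, calls, "reentrancy"))
```

In this sketch the prompt for `withdraw` includes `_send` but omits the unrelated `deposit` function, which illustrates how scoping shortens the code the model sees while keeping the segment self-contained.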