Smart contracts are susceptible to various security issues, among which access control (AC) vulnerabilities are particularly critical. While existing research has proposed multiple detection tools, automatically and appropriately repairing AC vulnerabilities in smart contracts remains a challenge. The vulnerability types that existing repair tools commonly support, such as reentrancy, are usually fixed by template-based approaches. For AC vulnerabilities, in contrast, the main obstacle lies in identifying the appropriate roles or permissions amid a large body of non-AC-related source code in order to generate proper patch code, a task that demands human-level intelligence. Leveraging recent advancements in large language models (LLMs), we employ the state-of-the-art GPT-4 model and enhance it with a novel approach called ACFIX. The key insight is that we can mine common AC practices for major categories of code functionality and use them to guide LLMs in fixing code with similar functionality. To this end, ACFIX involves both offline and online phases. First, during the offline phase, ACFIX mines a taxonomy of common Role-based Access Control (RBAC) practices from 344,251 on-chain contracts, categorizing 49 role-permission pairs from the top 1,000 pairs mined. Second, during the online phase, ACFIX tracks AC-related elements across the contract and uses this context information, together with a Chain-of-Thought pipeline, to guide LLMs in identifying the most appropriate role-permission pair for the subject contract and subsequently generating a suitable patch. The patch then undergoes a validity and effectiveness check. To evaluate ACFIX, we built the first benchmark dataset of 118 real-world AC vulnerabilities, and our evaluation revealed that ACFIX successfully repaired 94.92% of them. This represents a significant improvement over the baseline GPT-4, which achieved only 52.54%.
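To make the online phase more concrete, the following is a minimal sketch of its control flow only: select a role-permission pair, generate a patch, and keep the patch only if it passes a check. It is not ACFIX's implementation. The names `RolePermissionPair`, `repair_once`, and the three `demo_*` functions are hypothetical, the two taxonomy entries are invented for illustration, and simple string logic stands in for the LLM calls and for the validity and effectiveness check.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class RolePermissionPair:
    role: str        # who is allowed to act
    permission: str  # which action is being guarded


def repair_once(
    contract_source: str,
    ac_context: str,
    taxonomy: Sequence[RolePermissionPair],
    select_pair: Callable[[str, str, Sequence[RolePermissionPair]], RolePermissionPair],
    generate_patch: Callable[[str, RolePermissionPair], str],
    check_patch: Callable[[str, str], bool],
) -> Optional[str]:
    """One pass: pick a pair, generate a patch, return it only if the check passes."""
    pair = select_pair(contract_source, ac_context, taxonomy)
    patched = generate_patch(contract_source, pair)
    return patched if check_patch(contract_source, patched) else None


# --- Hypothetical stand-ins for the LLM steps and the patch check ---

def demo_select(source, context, taxonomy):
    # Assumption: choose the first pair whose role name appears in the AC context.
    for pair in taxonomy:
        if pair.role in context:
            return pair
    return taxonomy[0]


def demo_generate(source, pair):
    # Assumption: the patch is a single guard inserted at the start of the body.
    guard = f'require(msg.sender == {pair.role}, "not {pair.role}");'
    return source.replace("{", "{ " + guard, 1)


def demo_check(original, patched):
    # Assumption: a trivial check that a guard was actually added.
    return patched != original and "require(" in patched


# Invented example inputs (not taken from the paper's taxonomy or benchmark).
DEMO_TAXONOMY = [
    RolePermissionPair("minter", "mint tokens"),
    RolePermissionPair("owner", "withdraw funds"),
]
DEMO_SOURCE = "function withdraw() public { payable(msg.sender).transfer(address(this).balance); }"
DEMO_CONTEXT = "address owner; // assigned in the constructor"

DEMO_PATCH = repair_once(
    DEMO_SOURCE, DEMO_CONTEXT, DEMO_TAXONOMY, demo_select, demo_generate, demo_check
)

if __name__ == "__main__":
    print(DEMO_PATCH)
```

Passing the three steps in as callables keeps the control flow separate from whatever model or checker performs each step, which is why toy stand-ins can exercise it here.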