PrAIoritize: Automated Early Prediction and Prioritization of Vulnerabilities in Smart Contracts

Context: Smart contracts are prone to numerous security threats due to undisclosed vulnerabilities and code weaknesses. In Ethereum smart contracts, the difficulty of addressing these weaknesses in a timely manner highlights the need for automated early prediction and prioritization during code review. Objective: We aim to provide an automated approach, PrAIoritize, for predicting and prioritizing critical code weaknesses in Ethereum smart contracts during code review. Method: We collected smart contract code reviews from open-source software (OSS) projects on GitHub and from the Common Vulnerabilities and Exposures (CVE) database. We then developed PrAIoritize, an automated prioritization approach that combines Large Language Models (LLMs) with natural language processing (NLP) techniques. PrAIoritize first automates code review labeling using a domain-specific lexicon of smart contract weaknesses and their impacts. It then performs feature engineering on the code reviews and uses a pre-trained DistilBERT model for priority classification. Finally, the model is trained and evaluated on smart contract code reviews. Results: Our evaluation shows a significant improvement over state-of-the-art baselines and commonly used pre-trained models (e.g., T5) for similar classification tasks, with a 4.82% to 27.94% increase in F-measure, precision, and recall. Conclusion: With PrAIoritize, practitioners can efficiently prioritize smart contract code weaknesses, address critical ones promptly, and reduce the time and effort required for manual triage.
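The lexicon-based labeling step described in the abstract can be illustrated with a minimal sketch. The lexicon entries, the three priority levels, and the function name `label_review` below are all hypothetical and chosen for illustration only; the abstract does not specify the actual lexicon or label set used by PrAIoritize.

```python
import re

# Hypothetical lexicon mapping weakness terms to a priority level.
# These entries are illustrative assumptions, not the paper's lexicon.
LEXICON = {
    "reentrancy": "high",
    "integer overflow": "high",
    "access control": "high",
    "denial of service": "medium",
    "timestamp dependence": "medium",
    "gas optimization": "low",
    "unused variable": "low",
}

# Ordering used to pick the most severe label when several terms match.
RANK = {"low": 0, "medium": 1, "high": 2}


def label_review(text, default="low"):
    """Return the highest priority among lexicon terms found in the text."""
    # Lowercase and collapse whitespace so multi-word terms match reliably.
    normalized = re.sub(r"\s+", " ", text.lower())
    matched = [priority for term, priority in LEXICON.items() if term in normalized]
    if not matched:
        return default
    return max(matched, key=RANK.__getitem__)


if __name__ == "__main__":
    print(label_review("Possible Reentrancy in withdraw() before balance update"))
    print(label_review("nit: rename this helper"))
```

Labels produced this way could then serve as training targets for a text classifier such as the DistilBERT model mentioned above; plain substring matching is only one possible design, and a real lexicon would likely need stemming or pattern matching to handle wording variants.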

