Rust is an emerging programming language designed for the development of systems software. To facilitate the reuse of Rust code, crates.io, the central package registry of the Rust ecosystem, hosts thousands of third-party Rust packages. The openness of crates.io enables the growth of the Rust ecosystem but also introduces security risks, as evidenced by severe security advisories. Although Rust is designed to ensure program safety through language features and strict compile-time checking, the unsafe keyword allows developers to bypass the compiler's safety checks for certain regions of code. Prior studies empirically investigate memory safety and concurrency bugs in the Rust ecosystem, as well as the usage of the unsafe keyword in practice. Nonetheless, the literature lacks a systematic investigation of the security risks in the Rust ecosystem. In this paper, we perform a comprehensive investigation of these security risks, asking ``what are the characteristics of the vulnerabilities, what are the characteristics of the vulnerable packages, and how are the vulnerabilities fixed in practice?'' To facilitate the study, we first compile a dataset of 433 vulnerabilities, 300 vulnerable code repositories, and 218 vulnerability fix commits in the Rust ecosystem, spanning over 7 years. With the dataset, we characterize the types, life spans, and evolution of the disclosed vulnerabilities. We then characterize the popularity, categorization, and vulnerability density of the vulnerable Rust packages, as well as the versions and code regions affected by the disclosed vulnerabilities. Finally, we characterize the complexity of vulnerability fixes and the localities of the corresponding code changes, and examine how practitioners fix vulnerabilities of different localities in Rust packages.