Fast Summary-based Whole-program Analysis to Identify Unsafe Memory Accesses in Rust

Rust is one of the most promising systems programming languages for fundamentally solving the memory safety issues that have plagued low-level software for over forty years. However, Rust's type rules can be too restrictive for some systems programming tasks, and programmers sometimes favor performance over safety checks. To accommodate these cases, Rust provides escape hatches that allow writing unsafe code or calling unsafe libraries. Consequently, unsafe Rust code and directly linked unsafe foreign libraries may not only introduce memory safety violations themselves but also compromise the entire program, because they run in the same monolithic address space as safe Rust. This problem can be mitigated by isolating unsafe memory objects (those accessed by unsafe code) and sandboxing memory accesses to unsafe memory. One category of prior work uses existing program analysis frameworks on LLVM IR to identify unsafe memory objects and accesses. However, these approaches suffer from long analysis times and low precision. In this paper, we tackle these two challenges using summary-based whole-program analysis on Rust's MIR. The summary-based analysis computes information on demand to save analysis time. Performing the analysis on Rust's MIR exploits the rich high-level type information inherent to Rust, which is unavailable in LLVM IR. This manuscript is a preliminary study of ongoing research. We have prototyped a whole-program analysis that identifies both unsafe heap allocations and memory accesses to those unsafe heap objects. We report the overhead and efficacy of the analysis.
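
To make the idea of on-demand, summary-based analysis concrete, the following is a minimal sketch and not the prototype described above. It assumes a hypothetical toy IR (`Stmt`, `Func`, `Program`) that is far simpler than Rust's MIR, ignores aliasing and assignments, and handles recursion only by cutting cycles rather than iterating to a fixpoint. All names, including `Analyzer` and `demo_program`, are illustrative assumptions. A function's summary records which of its parameters reach an unsafe dereference, and a summary is computed only when a caller first needs it.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical toy IR; every name here is illustrative, not part of rustc or MIR.
enum Stmt {
    Alloc { dst: usize },                            // dst = new heap object
    UnsafeDeref { ptr: usize },                      // *ptr inside an unsafe block
    Call { callee: &'static str, args: Vec<usize> }, // callee(args...)
}

struct Func {
    n_params: usize, // parameters are locals 0..n_params
    body: Vec<Stmt>,
}

type Program = HashMap<&'static str, Func>;

struct Analyzer<'a> {
    program: &'a Program,
    // Cached summaries: parameter indices that reach an unsafe dereference.
    summaries: HashMap<&'static str, HashSet<usize>>,
    in_progress: HashSet<&'static str>,
}

impl<'a> Analyzer<'a> {
    fn new(program: &'a Program) -> Self {
        Analyzer { program, summaries: HashMap::new(), in_progress: HashSet::new() }
    }

    // Locals of `name` that are dereferenced unsafely, directly or via a callee.
    fn unsafe_locals(&mut self, name: &'static str) -> HashSet<usize> {
        let program = self.program;
        let mut tainted = HashSet::new();
        for stmt in &program[name].body {
            match stmt {
                Stmt::UnsafeDeref { ptr } => {
                    tainted.insert(*ptr);
                }
                Stmt::Call { callee, args } => {
                    // Consult the callee's summary instead of re-analyzing its body.
                    let summary = self.summary(*callee);
                    for (i, arg) in args.iter().enumerate() {
                        if summary.contains(&i) {
                            tainted.insert(*arg);
                        }
                    }
                }
                Stmt::Alloc { .. } => {}
            }
        }
        tainted
    }

    // Computes a summary on first request and caches it for later callers.
    fn summary(&mut self, name: &'static str) -> HashSet<usize> {
        if let Some(cached) = self.summaries.get(name) {
            return cached.clone();
        }
        if !self.in_progress.insert(name) {
            // Simplification: a recursive cycle yields an empty summary here;
            // a sound analysis would iterate to a fixpoint instead.
            return HashSet::new();
        }
        let n_params = self.program[name].n_params;
        let result: HashSet<usize> = self
            .unsafe_locals(name)
            .into_iter()
            .filter(|local| *local < n_params)
            .collect();
        self.in_progress.remove(name);
        self.summaries.insert(name, result.clone());
        result
    }

    // Heap allocations in `name` that are accessed by unsafe code.
    fn unsafe_allocs(&mut self, name: &'static str) -> Vec<usize> {
        let tainted = self.unsafe_locals(name);
        let program = self.program;
        program[name]
            .body
            .iter()
            .filter_map(|stmt| match stmt {
                Stmt::Alloc { dst } if tainted.contains(dst) => Some(*dst),
                _ => None,
            })
            .collect()
    }
}

// Illustrative input: main allocates two objects; only the second reaches unsafe code.
fn demo_program() -> Program {
    let mut p = Program::new();
    p.insert("write_raw", Func { n_params: 1, body: vec![Stmt::UnsafeDeref { ptr: 0 }] });
    p.insert("wrapper", Func {
        n_params: 2,
        body: vec![Stmt::Call { callee: "write_raw", args: vec![1] }],
    });
    p.insert("main", Func {
        n_params: 0,
        body: vec![
            Stmt::Alloc { dst: 0 },
            Stmt::Alloc { dst: 1 },
            Stmt::Call { callee: "wrapper", args: vec![0, 1] },
        ],
    });
    p
}
```

In this sketch, calling `Analyzer::new(&demo_program()).unsafe_allocs("main")` classifies only local 1 as an unsafe heap object, since it is the only allocation that flows through `wrapper` into the unsafe dereference in `write_raw`. The summaries for `wrapper` and `write_raw` are computed once, when `main` first needs them, and reused afterwards.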
