Fast Summary-based Whole-program Analysis to Identify Unsafe Memory Accesses in Rust

Rust is one of the most promising systems programming languages for fundamentally solving the memory safety issues that have plagued low-level software for over forty years. However, Rust's type rules can be too restrictive for some systems programming tasks, and programmers sometimes prefer performance over safety checks. To accommodate these cases, Rust provides escape hatches that allow writing unsafe code or calling unsafe libraries. Consequently, unsafe Rust code and directly linked unsafe foreign libraries may not only introduce memory safety violations themselves but also compromise the entire program, because they run in the same monolithic address space as safe Rust code. This problem can be mitigated by isolating unsafe memory objects (those accessed by unsafe code) and sandboxing memory accesses to that unsafe memory. One category of prior work uses existing program analysis frameworks on LLVM IR to identify unsafe memory objects and accesses. However, these approaches suffer from prolonged analysis time and low precision. In this paper, we tackle these two challenges using summary-based whole-program analysis on Rust's MIR. The summary-based analysis computes information on demand to save analysis time. Performing the analysis on Rust's MIR exploits the rich high-level type information inherent to Rust, which is unavailable in LLVM IR. This manuscript is a preliminary study of ongoing research. We have prototyped a whole-program analysis that identifies both unsafe heap allocations and memory accesses to those unsafe heap objects. We report the overhead and the efficacy of the analysis in this paper.
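To make the idea of on-demand summaries concrete, the following is a minimal sketch and not the prototype described in this paper. It assumes a toy intermediate representation (the hypothetical `Stmt`, `Function`, and `Analyzer` types below stand in for rustc's MIR, whose real data structures are far richer), treats a function's summary as the set of parameter indices that may reach an unsafe access, and ignores aliasing and assignments between locals. Summaries are computed only when a call site asks for them and are then cached.

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy stand-in for MIR statements (hypothetical; not rustc's real MIR types).
#[derive(Clone, Debug)]
enum Stmt {
    /// `var` receives a heap allocation made at the named allocation site.
    Alloc { var: usize, site: &'static str },
    /// `var` is accessed inside an `unsafe` block.
    UnsafeUse { var: usize },
    /// Call `callee`, passing local `args[i]` as the callee's i-th parameter.
    Call { callee: &'static str, args: Vec<usize> },
}

struct Function {
    /// Locals `0..params` are the function's parameters.
    params: usize,
    body: Vec<Stmt>,
}

type Program = HashMap<&'static str, Function>;
/// Summary: indices of parameters that may reach an unsafe access.
type Summary = BTreeSet<usize>;

struct Analyzer<'p> {
    program: &'p Program,
    /// Summaries computed so far; filled lazily, one entry per requested callee.
    cache: HashMap<&'static str, Summary>,
}

impl<'p> Analyzer<'p> {
    fn new(program: &'p Program) -> Self {
        Analyzer { program, cache: HashMap::new() }
    }

    /// Locals of `name` that may reach an unsafe access, directly or via callees.
    fn unsafe_locals(&mut self, name: &'static str) -> BTreeSet<usize> {
        let program = self.program;
        let mut locals = BTreeSet::new();
        let Some(func) = program.get(name) else { return locals };
        for stmt in &func.body {
            match stmt {
                Stmt::UnsafeUse { var } => {
                    locals.insert(*var);
                }
                Stmt::Call { callee, args } => {
                    // Reuse (or compute on demand) the callee's summary
                    // instead of re-analyzing its body at every call site.
                    let summary = self.summary(*callee);
                    for (i, arg) in args.iter().enumerate() {
                        if summary.contains(&i) {
                            locals.insert(*arg);
                        }
                    }
                }
                Stmt::Alloc { .. } => {}
            }
        }
        locals
    }

    /// On-demand, memoized summary of `name`.
    fn summary(&mut self, name: &'static str) -> Summary {
        if let Some(cached) = self.cache.get(name) {
            return cached.clone();
        }
        // An empty placeholder breaks recursion cycles. This is a simplification:
        // a full analysis would iterate to a fixed point instead.
        self.cache.insert(name, Summary::new());
        let params = self.program.get(name).map_or(0, |f| f.params);
        let result: Summary = self
            .unsafe_locals(name)
            .into_iter()
            .filter(|local| *local < params)
            .collect();
        self.cache.insert(name, result.clone());
        result
    }

    /// Allocation sites in `name` whose objects may be accessed by unsafe code.
    fn unsafe_alloc_sites(&mut self, name: &'static str) -> BTreeSet<&'static str> {
        let program = self.program;
        let locals = self.unsafe_locals(name);
        let mut sites = BTreeSet::new();
        if let Some(func) = program.get(name) {
            for stmt in &func.body {
                if let Stmt::Alloc { var, site } = stmt {
                    if locals.contains(var) {
                        sites.insert(*site);
                    }
                }
            }
        }
        sites
    }
}

/// Hypothetical program: `main` allocates `a` and `b` and passes only `a`
/// through `wrapper` to `write_raw`, which accesses it unsafely.
fn example_program() -> Program {
    let mut p = Program::new();
    p.insert("write_raw", Function { params: 1, body: vec![Stmt::UnsafeUse { var: 0 }] });
    p.insert("wrapper", Function {
        params: 1,
        body: vec![Stmt::Call { callee: "write_raw", args: vec![0] }],
    });
    p.insert("unused", Function { params: 1, body: vec![Stmt::UnsafeUse { var: 0 }] });
    p.insert("main", Function {
        params: 0,
        body: vec![
            Stmt::Alloc { var: 0, site: "main::a" },
            Stmt::Alloc { var: 1, site: "main::b" },
            Stmt::Call { callee: "wrapper", args: vec![0] },
        ],
    });
    p
}
```

Calling `Analyzer::new(&example_program()).unsafe_alloc_sites("main")` in this sketch flags only the allocation site `main::a`, and the function `unused` is never summarized because no analyzed call site requests it. That is the sense in which a summary-based analysis avoids work it does not need.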
