MSWasm: Soundly Enforcing Memory-Safe Execution of Unsafe Code

Alexandra E. Michael,Anitha Gollamudi,Jay Bosamiya,Craig Disselkoen,Aidan Denlinger,Conrad Watt,Bryan Parno,Marco Patrignani,Marco Vassena,Deian Stefan

Most programs compiled to WebAssembly (Wasm) today are written in unsafe languages like C and C++. Unfortunately, memory-unsafe C code remains unsafe when compiled to Wasm -- and attackers can exploit buffer overflows and use-after-frees in Wasm almost as easily as they can on native platforms. Memory-Safe WebAssembly (MSWasm) proposes to extend Wasm with language-level memory-safety abstractions to precisely address this problem. In this paper, we build on the original MSWasm position paper to realize this vision. We give a precise and formal semantics of MSWasm, and prove that well-typed MSWasm programs are, by construction, robustly memory safe. To this end, we develop a novel, language-independent memory-safety property based on colored memory locations and pointers. This property also lets us reason about the security guarantees of a formal C-to-MSWasm compiler -- and prove that it always produces memory-safe programs (and preserves the semantics of safe programs). We use these formal results to then guide several implementations: Two compilers of MSWasm to native code, and a C-to-MSWasm compiler (that extends Clang). Our MSWasm compilers support different enforcement mechanisms, allowing developers to make security-performance trade-offs according to their needs. Our evaluation shows that the overhead of enforcing memory safety in software ranges from 22% (enforcing spatial safety alone) to 198% (enforcing full memory safety) on the PolyBenchC suite. More importantly, MSWasm's design makes it easy to swap between enforcement mechanisms; as fast (especially hardware-based) enforcement techniques become available, MSWasm will be able to take advantage of these advances almost for free.

翻译：如今，编译为WebAssembly（Wasm）的多数程序由C和C++等不安全语言编写。遗憾的是，内存不安全的C代码在编译为Wasm后仍保持其不安全性——攻击者可以像在本地平台上一样轻松地利用Wasm中的缓冲区溢出和释放后使用漏洞。内存安全WebAssembly（MSWasm）提出通过语言级内存安全抽象扩展Wasm，以精准解决此问题。本文在原始MSWasm立场论文的基础上实现这一愿景。我们给出了MSWasm精确且形式化的语义，并证明了类型正确的MSWasm程序在构造上具有鲁棒的内存安全性。为此，我们基于彩色内存位置与指针开发了一种新颖的、语言无关的内存安全属性。该属性还能让我们推理形式化C-to-MSWasm编译器的安全保证——并证明其始终生成内存安全的程序（同时保留安全程序的语义）。我们利用这些形式化结果指导多项实现：两个将MSWasm编译为本地代码的编译器，以及一个扩展了Clang的C-to-MSWasm编译器。我们的MSWasm编译器支持不同的强制机制，使开发者能够根据需求在安全性与性能之间进行权衡。评估表明，在PolyBenchC基准测试套件上，软件中强制内存安全带来的开销从22%（仅强制空间安全）到198%（强制完整内存安全）不等。更重要的是，MSWasm的设计使得在不同强制机制间切换变得容易；随着快速（尤其是基于硬件的）强制技术变得可用，MSWasm几乎能零成本地利用这些进步。