The concept of ownership in high level languages can aid both the programmer and the compiler to reason about the validity of memory operations. Previously, ownership semantics has been used successfully in high level automatic program verification to model a reference to data by a first order logic (FOL) representation of data instead of maintaining an address map. However, ownership semantics is not used in low level program verification. We have identified two challenges. First, ownership information is lost when a program is compiled to a low level intermediate representation (e.g., in LLVM IR). Second, pointers in low level programs point to bytes using an address map (e.g., in unsafe Rust) and thus the verification condition (VC) cannot always replace a pointer by its FOL abstraction. To remedy the situation, we develop ownership semantics for an LLVM like low level intermediate representation. Using these semantics, the VC can opportunistically model some memory accesses by a direct access of a pointer cache that stores byte representation of data. This scheme reduces instances where an address map must be maintained, especially for mostly safe programs that follow ownership semantics. For unsafe functionality, memory accesses are modelled by operations on an address map and we provide mechanisms to keep the address map and pointer cache in sync. We implement these semantics in SEABMC, a bit precise bounded model checker for LLVM. For evaluation, the source programs are assumed to be written in C. Since C does not have ownership built in, suitable macros are added that introduce and preserve ownership during translation to LLVM like IR for verification. This approach is evaluated on mature open source C code. For both handcrafted benchmarks and practical programs, we observe a speedup of $1.3x-5x$ during SMT solving.
翻译:高级语言中的所有权概念可协助程序员和编译器推理内存操作的有效性。先前,所有权语义已成功应用于高级自动程序验证中,通过数据的一阶逻辑表示而非维护地址映射来建模对数据的引用。然而,所有权语义在低级程序验证中尚未得到应用。我们识别出两大挑战:首先,当程序被编译至低级中间表示时,所有权信息会丢失;其次,低级程序中的指针通过地址映射指向字节,因此验证条件无法始终用一阶逻辑抽象替代指针。为改善此状况,我们为类LLVM低级中间表示开发了所有权语义。利用这些语义,验证条件可机会性地将部分内存访问建模为直接访问存储数据字节表示的指针缓存。该方案减少了必须维护地址映射的情况,尤其适用于遵循所有权语义的多数安全程序。对于不安全功能,内存访问通过地址映射操作进行建模,我们提供了保持地址映射与指针缓存同步的机制。我们在SEABMC中实现了这些语义。为进行评估,假设源程序用C语言编写。由于C语言未内置所有权,我们添加了合适的宏,在翻译至类LLVM中间表示进行验证时引入并保持所有权。该方法在成熟的C语言开源代码上进行了评估。无论是人工构造的基准测试还是实际程序,在SMT求解阶段均观察到$1.3x-5x$的加速。