Interest in translating C code to Rust is growing because of Rust's strong memory-safety and thread-safety guarantees. Tools such as C2RUST perform syntax-guided transpilation from C to semantically equivalent Rust code. However, the resulting Rust programs often rely heavily on unsafe constructs, particularly raw pointers, which undermines those guarantees. This paper aims to improve the memory safety of Rust programs generated by C2RUST by eliminating raw pointers. Specifically, we propose PR2, a raw pointer rewriting technique that lifts raw pointers in individual functions to appropriate Rust data structures. PR2 uses decision-tree-based prompting to guide the pointer lifting process, and it uses code change analysis to guide the repair of errors introduced during rewriting, covering errors that surface during compilation and during test case execution. We implement PR2 and evaluate it with gpt-4o-mini on 28 real-world C projects. PR2 eliminates 18.57% of the local raw pointers across these projects, improving the safety of the translated Rust code. On average, PR2 transforms a project in 5.02 hours at a cost of $1.13.
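To make "lifting a local raw pointer" concrete, the following is a minimal sketch and not output produced by PR2 or C2RUST. The function names `sum_raw` and `sum_lifted` are hypothetical, and the choice of a slice as the target data structure is an assumption made for illustration. The first function walks a buffer with a local raw pointer, in the style transpiled code often takes. The second keeps the same signature but converts the pointer to a slice once, so the loop itself needs no pointer arithmetic.

```rust
// Hypothetical example in the style of transpiled code: a local raw
// pointer `p` is advanced by hand and dereferenced on every iteration.
pub unsafe fn sum_raw(data: *const i32, len: usize) -> i32 {
    let mut total: i32 = 0;
    let mut p: *const i32 = data; // local raw pointer
    let mut remaining = len;
    while remaining > 0 {
        unsafe {
            total = total.wrapping_add(*p);
            p = p.add(1);
        }
        remaining -= 1;
    }
    total
}

// Illustrative rewrite (an assumption, not the paper's output): the
// signature is unchanged, but the local raw pointer is replaced by a
// slice that is created once at the function boundary.
pub unsafe fn sum_lifted(data: *const i32, len: usize) -> i32 {
    // `from_raw_parts` requires a non-null, aligned pointer even when
    // `len` is zero, so the empty case is handled separately.
    let items: &[i32] = if data.is_null() || len == 0 {
        &[]
    } else {
        unsafe { std::slice::from_raw_parts(data, len) }
    };
    // From here on the body is safe Rust over a bounds-checked slice.
    items.iter().fold(0i32, |acc, v| acc.wrapping_add(*v))
}
```

Both functions can be called the same way, for example `unsafe { sum_lifted(v.as_ptr(), v.len()) }` for an array `v`. The remaining `unsafe` is confined to the single conversion at the top of the rewritten function, which is the kind of reduction in raw pointer use the abstract describes.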