C is an unsafe language. Researchers have been developing tools to port C to safer languages such as Rust, Checked C, or Go. Existing tools, however, preprocess the source file first and then port the resulting code, producing barely recognizable output that loses macro abstractions. To preserve macro usage, porting tools need analyses that understand macro behavior well enough to port macros to equivalent constructs. But macro semantics differ from those of typical functions, which precludes porting them with simple syntactic transformations. We introduce the first comprehensive framework for analyzing the portability of macro usage. We decompose macro behavior into 26 fine-grained properties and implement a program analysis tool, called Maki, that identifies them in real-world code with 94% accuracy. We apply Maki to 21 programs containing a total of 86,199 macro definitions. We find that real-world macros are much more portable than previously known. More than a third (37%) are easy to port, and Maki provides hints for porting more complicated macros. We find, on average, 2x more easy-to-port macros than prior work, and up to 7x more in the best case. Guided by Maki's output, we found and hand-ported macros in four real-world programs. We submitted patches to Linux maintainers that transform eleven macros, nine of which have been accepted.
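As a minimal illustration of why macro semantics differ from function semantics, consider a macro whose argument has a side effect. This sketch is not taken from the paper, and all names in it (`SQUARE`, `square`, `next_value`, `calls`) are hypothetical. The macro pastes its argument text twice, so the side effect runs twice, while the function evaluates its argument once. A purely syntactic rewrite of this macro into a function would therefore change program behavior.

```c
/* Hypothetical example; none of these names come from the paper. */

/* Macro: the argument text is substituted twice into the expansion. */
#define SQUARE(x) ((x) * (x))

/* Function: the argument is evaluated exactly once before the call. */
static int square(int x) { return x * x; }

/* Counts how many times next_value() has been called. */
static int calls = 0;

/* An argument expression with a visible side effect. */
static int next_value(void) {
    calls++;
    return calls;
}

/* Returns the number of times the argument was evaluated via the macro. */
int calls_via_macro(void) {
    calls = 0;
    (void)SQUARE(next_value()); /* expands to ((next_value()) * (next_value())) */
    return calls;
}

/* Returns the number of times the argument was evaluated via the function. */
int calls_via_function(void) {
    calls = 0;
    (void)square(next_value());
    return calls;
}
```

Under these assumptions, `calls_via_macro()` reports two evaluations and `calls_via_function()` reports one. Argument duplication is only one of several ways a macro can diverge from a function, so a macro that passes this check is not necessarily safe to port.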