Cybersecurity attacks on embedded devices in industrial control systems and cyber-physical systems may cause catastrophic physical damage as well as economic loss. An attacker can achieve this by infecting device binaries with malware that modifies the physical characteristics of the system's operation. Mitigating such attacks benefits from reverse engineering tools that recover sufficient semantic knowledge of the implemented algorithm in the form of mathematical equations. Conventional reverse engineering tools can decompile binaries to low-level code, but offer little semantic insight. This paper proposes REMaQE, an automated framework for reverse engineering math equations from binary executables. Improving over the state of the art, REMaQE handles equation parameters accessed via registers, the stack, global memory, or pointers, and can reverse engineer object-oriented implementations such as C++ classes. Using REMaQE, we discovered a bug in the Linux kernel thermal monitoring tool "tmon". To evaluate REMaQE, we generate a dataset of 25,096 binaries with math equations implemented in C and Simulink. REMaQE successfully recovers a semantically matching equation for all 25,096 binaries. REMaQE executes in 0.48 seconds on average and in up to 2 seconds for complex equations. Real-time execution enables integration into an interactive math-oriented reverse engineering workflow.
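To make the notion of a "semantically matching equation" concrete, the following is a minimal sketch of one way two equations could be compared: evaluate both on random inputs and check that they agree within a floating-point tolerance. This is an illustrative assumption, not the comparison procedure used by REMaQE, which the abstract does not describe. The function names (`semantically_match`, `reference`, `recovered`, `mismatched`) and the PI-style example equation are hypothetical.

```python
import math
import random


def semantically_match(f, g, n_params, trials=1000, lo=-10.0, hi=10.0, seed=0):
    """Return True if f and g agree on `trials` random inputs.

    Agreement on sampled points is evidence of equivalence, not a proof.
    """
    rng = random.Random(seed)  # fixed seed keeps the check reproducible
    for _ in range(trials):
        args = [rng.uniform(lo, hi) for _ in range(n_params)]
        # Tolerances absorb rounding differences between rearranged forms.
        if not math.isclose(f(*args), g(*args), rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True


# Hypothetical ground-truth equation, as it might be written in source code.
def reference(kp, ki, err, acc):
    return kp * err + ki * (acc + err)


# Hypothetical recovered equation: same math, algebraically rearranged.
def recovered(kp, ki, err, acc):
    return (kp + ki) * err + ki * acc


# Hypothetical wrong recovery: the integral term omits the current error.
def mismatched(kp, ki, err, acc):
    return kp * err + ki * acc
```

Under these assumptions, `semantically_match(reference, recovered, 4)` accepts the rearranged form, while `semantically_match(reference, mismatched, 4)` rejects the equation with the missing term. A symbolic comparison would give a stronger guarantee than sampling; the numeric check is shown only because it needs nothing beyond the standard library.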