Reimagining Disassembly Interfaces with Visualization: Combining Instruction Tracing and Control Flow with DisViz

In applications where efficiency is critical, developers may examine their compiled binaries, seeking to understand how the compiler transformed their source code and what performance implications that transformation may have. This analysis is challenging due to the vast number of disassembled binary instructions and the many-to-many mappings between them and the source code. These problems are exacerbated as source code size increases, giving the compiler more freedom to map and disperse binary instructions across the disassembly space. Interfaces for disassembly typically either display instructions as an unstructured listing or sacrifice the order of execution. We design a new visual interface for disassembly code that combines execution order with control flow structure, enabling analysts to both trace through code and identify familiar aspects of the computation. Central to our approach is a novel layout of instructions grouped into basic blocks that displays looping structure in an intuitive way. We add to this disassembly representation a unique block-based mini-map that leverages our layout and shows context across thousands of disassembly instructions. Finally, we embed our disassembly visualization in a web-based tool, DisViz, which adds dynamic linking with source code across the entire application. DisViz was developed in collaboration with program analysis experts following design study methodology and was validated through evaluation sessions with ten participants from four institutions. Participants successfully completed the evaluation tasks, hypothesized about compiler optimizations, and noted the utility of our new disassembly view. Our evaluation suggests that our new integrated view helps application developers understand and navigate disassembly code.
