Predicting program behavior and reasoning about code execution remain significant challenges in software engineering, particularly for large language models (LLMs) designed for code analysis. While these models excel at understanding static syntax, they often struggle with dynamic reasoning tasks. We introduce Visual Coder, a simple yet effective approach that enhances code reasoning by integrating multimodal Chain-of-Thought (CoT) reasoning with a visual Control Flow Graph (CFG). By aligning code snippets with their corresponding CFGs, Visual Coder exposes the execution flow explicitly, enabling more accurate predictions of code behavior. Our experiments show that augmenting LLMs with visual CFGs yields significantly better results on code reasoning tasks than supplying text-based CFG descriptions. We address the challenges of multimodal CoT integration with a reference mechanism that keeps the code and its execution path consistent, thereby improving performance in program behavior prediction, error detection, and output generation.
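To make the idea of aligning a code snippet with its CFG concrete, the following is a minimal sketch and not the authors' implementation. It assumes a tiny subset of Python (simple statements, `if`/`else`, and `while`, with no `break`, `continue`, or `return`), and the helper names `build_cfg` and `to_dot` are hypothetical. Each node is labelled with its source line number, a simple stand-in for the kind of code-to-graph reference the abstract describes, and the graph is emitted as Graphviz DOT text that could then be rendered to an image.

```python
import ast


def build_cfg(source: str):
    """Build a CFG for a small Python subset (hypothetical helper).

    Returns (nodes, edges): nodes maps a line number to a label, and
    edges is a list of (src_line, dst_line, edge_label) tuples.
    """
    tree = ast.parse(source)
    nodes = {}
    edges = []

    def add_node(stmt, text):
        # Prefix the label with the line number so a reasoning step can
        # refer back to the exact source line.
        nodes[stmt.lineno] = f"L{stmt.lineno}: {text}"
        return stmt.lineno

    def visit_block(stmts, preds):
        # preds holds (node_id, edge_label) pairs flowing into the next statement.
        for stmt in stmts:
            if isinstance(stmt, ast.If):
                nid = add_node(stmt, "if " + ast.unparse(stmt.test))
                edges.extend((p, nid, lbl) for p, lbl in preds)
                then_exits = visit_block(stmt.body, [(nid, "True")])
                if stmt.orelse:
                    else_exits = visit_block(stmt.orelse, [(nid, "False")])
                else:
                    else_exits = [(nid, "False")]
                preds = then_exits + else_exits
            elif isinstance(stmt, ast.While):
                nid = add_node(stmt, "while " + ast.unparse(stmt.test))
                edges.extend((p, nid, lbl) for p, lbl in preds)
                body_exits = visit_block(stmt.body, [(nid, "True")])
                # Back edge from the end of the loop body to the loop test.
                edges.extend((p, nid, lbl) for p, lbl in body_exits)
                preds = [(nid, "False")]
            else:
                nid = add_node(stmt, ast.unparse(stmt))
                edges.extend((p, nid, lbl) for p, lbl in preds)
                preds = [(nid, "")]
        return preds

    visit_block(tree.body, [])
    return nodes, edges


def to_dot(nodes, edges):
    """Serialize the CFG as Graphviz DOT text (hypothetical helper)."""
    lines = ["digraph cfg {"]
    for nid, label in sorted(nodes.items()):
        escaped = label.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'  n{nid} [label="{escaped}"];')
    for src, dst, lbl in edges:
        attr = f' [label="{lbl}"]' if lbl else ""
        lines.append(f"  n{src} -> n{dst}{attr};")
    lines.append("}")
    return "\n".join(lines)


# Example snippet chosen for illustration only.
SNIPPET = """\
x = 3
while x > 0:
    if x % 2 == 0:
        print(x)
    x -= 1
print("done")
"""

nodes, edges = build_cfg(SNIPPET)
dot = to_dot(nodes, edges)
print(dot)
```

Saving the printed text to a file such as `cfg.dot` and running `dot -Tpng cfg.dot -o cfg.png` (assuming Graphviz is installed) would produce an image that can be paired with the snippet in a multimodal prompt. How Visual Coder itself constructs and renders its CFGs is not specified in this abstract.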