A variety of code analyzers, such as IACA, uiCA, llvm-mca or Ithemal, strive to statically predict the throughput of a computation kernel. Each analyzer relies on its own simplified CPU model, which reasons at the scale of a basic block. Given this diversity, evaluating their strengths and weaknesses is important to guide both their usage and their enhancement. We present CesASMe, a fully-tooled solution for evaluating code analyzers on C-level benchmarks; it consists of a benchmark derivation procedure that feeds an evaluation harness. From this evaluation, we conclude that memory-carried data dependencies are a major source of imprecision for these tools. We tackle this issue with staticdeps, a static analyzer that extracts memory-carried data dependencies, including those spanning loop iterations, from an assembly basic block. We integrate its output into uiCA, a state-of-the-art code analyzer, and use CesASMe to evaluate the impact of staticdeps on a code analyzer's precision.