Modern processors increasingly rely on SIMD instruction sets, such as AVX and RVV, to increase parallelism and computational performance. However, production compilers such as LLVM and GCC often fail to fully exploit available vectorization opportunities because their vectorization passes are disjoint and hard to extend. Although recent work on heuristics and intermediate representation (IR) design has tried to address these problems, simplifying control flow analysis efficiently and identifying vectorization opportunities accurately remain challenging. To address these issues, we introduce a novel vectorization pipeline featuring two specialized IR extensions: SIR, which encodes high-level structural information, and VIR, which explicitly represents instruction dependencies obtained through data dependency analysis. Leveraging the detailed dependency information provided by VIR, we develop a flexible and extensible vectorization framework. This approach improves interoperability across vectorization passes and expands the search space for identifying isomorphic instructions, broadening both the scope and the efficiency of automatic vectorization. Experimental evaluation shows that the proposed pipeline delivers speedups of up to 53% and 58% over LLVM and GCC, respectively.