One of the primary areas of interest in High Performance Computing is improving the performance of parallel workloads. Deep learning approaches to source-code-based optimization tasks commonly exploit LLVM Intermediate Representations (IRs) to extract features from compilable code. Most such works target specific tasks or are designed around a pre-defined set of heuristics. Pre-trained models remain rare in this domain, although their potential has been widely discussed; in particular, approaches mimicking large language models (LLMs) have been proposed, but these carry prohibitively large training costs. In this paper, we propose MIREncoder, a Multi-modal IR-based Auto-Encoder that can be pre-trained to generate a learned embedding space for use in downstream machine learning tasks. A multi-modal approach enables us to better extract features from compilable programs and to better model code syntax, semantics, and structure, all of which are crucial when making code-based performance optimization decisions. A pre-trained model/embedding implicitly enables transfer learning and helps move away from task-specific trained models. Additionally, a pre-trained model used for downstream performance optimization should itself have low overhead and be easy to use. These considerations have led us to propose a modeling approach that i) understands code semantics and structure, ii) enables transfer learning, and iii) is small and simple enough to be easily re-purposed or reused even with low resource availability. Our evaluations show that the proposed approach can outperform the state of the art while reducing overhead.