High-level synthesis (HLS) is a widely used tool for designing Field Programmable Gate Arrays (FPGAs). HLS enables FPGA design with software programming languages by compiling source code into an FPGA circuit. The source code comprises a program (called a ``kernel'') and several pragmas that instruct hardware synthesis, e.g., parallelization and pipelining. While it is relatively easy for software developers to design the program, designing the pragmas relies heavily on hardware knowledge, posing a big challenge for software developers. Recently, various machine learning algorithms, such as graph neural networks (GNNs), have been proposed to automate pragma design via performance prediction. However, when a trained model is applied to new kernels, the significant domain shift often leads to unsatisfactory performance. We propose a more domain-generalizable model structure: a two-level hierarchical Mixture of Experts (MoE), which can be flexibly adapted to any GNN model. Different expert networks can learn to handle different regions of the representation space, exploiting patterns shared between old and new kernels. In the low-level MoE, we apply MoE on three natural granularities of a program: node, basic block, and graph. The high-level MoE learns to aggregate the three granularities for the final decision. To stably train the hierarchical MoE, we further propose a two-stage training method. Extensive experiments verify the effectiveness of the hierarchical MoE.
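The two-level structure can be illustrated with a minimal sketch: low-level MoEs produce one representation per granularity (node, basic block, graph), and a high-level gate mixes the three for the final prediction. All dimensions, the expert count, the soft-gating scheme, and every class/function name below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D score vector.
    e = np.exp(x - x.max())
    return e / e.sum()

class MoE:
    """Soft mixture of experts: a gate weighs the outputs of linear experts.
    (Hypothetical simplification; real experts would be small networks.)"""
    def __init__(self, in_dim, out_dim, n_experts, rng):
        self.experts = [rng.standard_normal((in_dim, out_dim)) for _ in range(n_experts)]
        self.gate = rng.standard_normal((in_dim, n_experts))

    def __call__(self, h):
        w = softmax(h @ self.gate)                       # gating weights over experts
        outs = np.stack([h @ W for W in self.experts])   # (n_experts, out_dim)
        return w @ outs                                  # weighted expert combination

class HierarchicalMoE:
    """Low-level MoEs on node/block/graph embeddings; a high-level gate
    aggregates the three granularities into one performance prediction."""
    def __init__(self, dim, n_experts=4, seed=0):
        rng = np.random.default_rng(seed)
        self.low = {g: MoE(dim, dim, n_experts, rng)
                    for g in ("node", "block", "graph")}
        self.high_gate = rng.standard_normal((3 * dim, 3))  # gate over granularities
        self.head = rng.standard_normal(dim)                # final regression head

    def __call__(self, h_node, h_block, h_graph):
        lows = [self.low["node"](h_node),
                self.low["block"](h_block),
                self.low["graph"](h_graph)]
        w = softmax(np.concatenate(lows) @ self.high_gate)  # weights over 3 granularities
        mixed = sum(wi * li for wi, li in zip(w, lows))
        return float(mixed @ self.head)                     # scalar performance estimate

# Usage with random stand-ins for the pooled GNN embeddings of one kernel:
model = HierarchicalMoE(dim=8)
rng = np.random.default_rng(1)
pred = model(rng.standard_normal(8), rng.standard_normal(8), rng.standard_normal(8))
```

In this sketch the gates are "soft" (every expert contributes with some weight), which keeps training simple; sparse top-k gating is a common alternative when the number of experts grows.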