When used effectively, Supercloud heterogeneous systems can significantly improve performance. Our ReDSEa tool-chain automates mapping, load balancing, scheduling, parallelization, and overlapping for the Triangular System Solver (TS) on a heterogeneous system consisting of a Huawei Kunpeng ARM multi-core CPU and an Ascend 910 AI hardware accelerator. We propose an LLVM compiler tool-chain that a) leverages compiler analysis and b) uses novel performance models that explore recursive, iterative, and blocked computation models. Our tool-chain achieves a speedup of up to 16x over an optimized 48-core CPU-only implementation.
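To make the iterative and blocked computation models named above concrete, the following is a minimal sketch of a lower-triangular solve written both ways. It is not the ReDSEa implementation: the function names `forward_substitution` and `blocked_forward_substitution`, the `block` parameter, and the list-of-lists matrix layout are assumptions chosen for illustration only.

```python
def forward_substitution(L, b):
    """Iterative model: solve L x = b for a lower-triangular L (list of rows)."""
    n = len(b)
    x = [0.0] * n
    for i in range(n):
        # Subtract the contribution of the already-computed unknowns.
        s = sum(L[i][j] * x[j] for j in range(i))
        x[i] = (b[i] - s) / L[i][i]
    return x


def blocked_forward_substitution(L, b, block=2):
    """Blocked model: small triangular solves on the diagonal blocks,
    each followed by an update of the remaining right-hand side."""
    n = len(b)
    rhs = [float(v) for v in b]  # working copy of the right-hand side
    x = [0.0] * n
    for start in range(0, n, block):
        end = min(start + block, n)
        # Solve the diagonal block with the iterative routine.
        diag = [row[start:end] for row in L[start:end]]
        x[start:end] = forward_substitution(diag, rhs[start:end])
        # Trailing update: rhs[i] -= L[i][start:end] . x[start:end]
        for i in range(end, n):
            rhs[i] -= sum(L[i][j] * x[j] for j in range(start, end))
    return x


if __name__ == "__main__":
    L = [[2.0, 0.0, 0.0],
         [1.0, 3.0, 0.0],
         [4.0, 5.0, 6.0]]
    b = [2.0, 7.0, 32.0]
    print(forward_substitution(L, b))
    print(blocked_forward_substitution(L, b, block=2))
```

In the blocked form, the trailing update is a dense matrix-vector (or, for multiple right-hand sides, matrix-matrix) product, while the diagonal solves stay sequential. This split is the usual reason a blocked formulation is attractive when work can be divided between different processing units; how ReDSEa divides the work is not described in this abstract.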