Computer architectures have diversified in recent years. To achieve high performance in numerical calculation software, the software must be tuned to the target computer architecture. However, optimizing code for each environment is difficult for anyone other than a specialist with detailed knowledge of computer architectures. Autotuning (AT) reduces this tuning effort and makes optimized, higher-performance implementations available even to non-experts. In this research, we propose an AT technique for programs that use OpenMP (Open Multi-Processing). The method uses an AT language to change the loop targeted by OpenMP optimization and to dynamically change the number of OpenMP threads according to the computational kernel. We evaluated performance on the Fujitsu PRIMEHPC FX100, a K-computer-type supercomputer installed at the Information Technology Center, Nagoya University. In a plasma turbulence analysis, the method achieved a speedup of 1.801 times over the original code.
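The idea of choosing the number of OpenMP threads separately for each computational kernel can be illustrated with a minimal sketch. This is not the AT language or the implementation evaluated in this work. It is a hand-written C approximation under stated assumptions: the names `kernel_fn`, `at_select_threads`, `axpy_args`, and `axpy_kernel` are hypothetical, each candidate thread count is timed only once, and the example kernel is a simple vector update, not the plasma turbulence code.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical kernel signature: run one kernel with a given thread count. */
typedef void (*kernel_fn)(int num_threads, void *data);

/* Wall-clock timer; falls back to clock() when built without OpenMP. */
static double wall_time(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Run the kernel once per candidate thread count and return the fastest
 * candidate. A real autotuner would repeat measurements and store the
 * result so that later calls of the same kernel can reuse it. */
int at_select_threads(kernel_fn kernel, void *data,
                      const int *candidates, size_t n) {
    assert(n > 0);
    int best = candidates[0];
    double best_time = -1.0;
    for (size_t i = 0; i < n; ++i) {
        double t0 = wall_time();
        kernel(candidates[i], data);
        double elapsed = wall_time() - t0;
        if (best_time < 0.0 || elapsed < best_time) {
            best_time = elapsed;
            best = candidates[i];
        }
    }
    return best;
}

/* Illustrative kernel (an assumption): x[i] += 2 * y[i]. */
typedef struct {
    double *x;
    const double *y;
    size_t n;
} axpy_args;

void axpy_kernel(int num_threads, void *data) {
    axpy_args *a = (axpy_args *)data;
    long n = (long)a->n;
    (void)num_threads; /* unused when compiled without OpenMP */
    /* The num_threads clause sets the team size for this region only. */
    #pragma omp parallel for num_threads(num_threads)
    for (long i = 0; i < n; ++i) {
        a->x[i] += 2.0 * a->y[i];
    }
}
```

After tuning, each kernel would be called with its own selected thread count, for example `axpy_kernel(best, &args)`, so that different kernels in the same program can run with different numbers of threads. Note that the tuning run itself executes the kernel once per candidate, so it should operate on data where repeated execution is acceptable.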