Mixed-precision computing has the potential to significantly reduce the cost of exascale computations, but determining when and how to apply it in a program can be challenging. In this article, we propose a methodology for enabling mixed precision with the help of computer arithmetic tools and techniques and the roofline model. As case studies, we consider Nekbone, a mini-application for the Computational Fluid Dynamics (CFD) solver Nek5000, and Neko, a modern CFD application. Using the VerifiCarlo tool and computer arithmetic techniques, we introduce a strategy to address stagnation issues in the preconditioned Conjugate Gradient method in Nekbone, and we apply these insights to implement a mixed-precision version of Neko. We evaluate the resulting mixed-precision versions of both codes along three dimensions: accuracy, time-to-solution, and energy-to-solution. Notably, mixed precision in Nekbone reduces time-to-solution by roughly 38% and energy-to-solution by 2.8x on MareNostrum 5, while in the real-world Neko application the gain is up to 29% in time and up to 24% in energy, without sacrificing accuracy.