Branch-free algorithms for multiple-precision floating-point arithmetic can significantly accelerate multi-component arithmetic built by combining hardware binary64 and binary32 values, particularly for triple- and quadruple-precision computations. In this study, we report benchmark results on x86 and ARM CPU platforms that quantify the speedups obtained in linear computations and polynomial evaluation when these algorithms are integrated.
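To illustrate what "branch-free" means in multi-component arithmetic, the following is a minimal sketch and not the implementation benchmarked in this study. It assumes Python floats are IEEE 754 binary64 with round-to-nearest, and it contrasts Knuth's TwoSum (no comparison needed) with Dekker's FastTwoSum, which needs a magnitude comparison when the ordering of the operands is unknown. The function names `two_sum`, `fast_two_sum_branching`, and `dd_add` are illustrative choices, and `dd_add` is the simple ("sloppy") two-component addition, shown only to indicate how such building blocks combine.

```python
def two_sum(a, b):
    """Knuth's TwoSum: branch-free error-free addition.

    Returns (s, e) with s = fl(a + b) and a + b == s + e exactly
    (barring overflow), using six floating-point operations.
    """
    s = a + b
    bb = s - a                      # portion of b that was absorbed into s
    e = (a - (s - bb)) + (b - bb)   # rounding error of a + b
    return s, e


def fast_two_sum_branching(a, b):
    """Dekker's FastTwoSum guarded by a comparison.

    The three-operation core is only valid when |a| >= |b|, so a
    general-purpose version needs the branch below.
    """
    if abs(a) < abs(b):
        a, b = b, a
    s = a + b
    e = b - (s - a)
    return s, e


def dd_add(x, y):
    """Simple two-component (double-double style) addition, branch-free.

    x and y are (hi, lo) pairs. This is the "sloppy" variant: cheap,
    but with a weaker error bound when the operands nearly cancel.
    """
    s, e = two_sum(x[0], y[0])
    e += x[1] + y[1]
    hi = s + e
    lo = e - (hi - s)               # FastTwoSum core; assumes |s| >= |e|
    return hi, lo


if __name__ == "__main__":
    print(two_sum(1.0, 2.0 ** -60))
    print(dd_add((1.0, 0.0), (2.0 ** -60, 0.0)))
```

In compiled code the branch-free form trades a few extra floating-point operations for the absence of a data-dependent comparison, which tends to suit pipelined and SIMD execution; whether that trade pays off on a given CPU is exactly the kind of question the benchmarks address.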