We propose fork-join and task-based hybrid implementations of four classical iterative methods from linear algebra (Jacobi, Gauss-Seidel, conjugate gradient and biconjugate gradient stabilised), as well as variations of them. The algorithms are documented and the corresponding source code is made publicly available for reproducibility. Both weak and strong scalability benchmarks are conducted to statistically analyse their relative efficiencies. The weak scalability results demonstrate the superiority of task-based hybrid parallelisation over MPI-only and fork-join hybrid implementations. Indeed, the task-based model achieves speedups up to 25% larger than those of its MPI-only counterpart, depending on the numerical method and the computational resources used. In strong scalability scenarios, task-based hybrid methods remain more efficient with moderate computational resources, where data locality does not play an important role. Fork-join hybridisation often yields mixed results and hence does not present a competitive advantage over a much simpler MPI-only approach.