We propose fork-join and task-based hybrid implementations of four classical iterative methods from linear algebra (Jacobi, Gauss-Seidel, conjugate gradient and biconjugate gradient stabilised), as well as variations of them. The algorithms are documented and the corresponding source code is made publicly available for reproducibility. Both weak and strong scalability benchmarks are conducted to statistically analyse their relative efficiencies. The weak scalability results show that task-based hybrid parallelisation outperforms both the MPI-only and the fork-join hybrid implementations. Indeed, the task-based model achieves speedups up to 25% larger than those of its MPI-only counterpart, depending on the numerical method and the computational resources used. In strong scalability scenarios, task-based hybrid methods remain more efficient with moderate computational resources, where data locality does not play an important role. Fork-join hybridisation often yields mixed results and hence does not present a competitive advantage over a much simpler MPI-only approach.