We present and compare distributed parallelization strategies for the particle-in-Fourier (PIF) schemes used in kinetic plasma simulations. The strategies are: i) domain decomposition, where both the particles and the Fourier modes are split among the MPI ranks; ii) particle decomposition, where only the particles are split among the ranks and each rank carries all the modes; and iii) space-time decomposition, in which time parallelization based on the parareal algorithm is added on top of the particle decomposition. We describe the communication patterns involved in each strategy, identify the parameter regimes where each works best, and explain their advantages and disadvantages. We implement the strategies in the open-source, performance-portable library IPPL and conduct scaling studies with 3D-3V Landau damping and Penning trap benchmark problems on the Alps and JUWELS Booster supercomputers. We analyze the dominant component timings in each strategy and identify areas for future optimization.
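As a rough illustration of the particle decomposition described above, the following Python sketch emulates several ranks inside a single process: the particles are split into chunks, every "rank" computes its partial contribution to all Fourier modes, and the partial results are summed, which stands in for an MPI all-reduce. This is a sketch under stated assumptions and not the IPPL implementation: the function names `local_modes` and `particle_decomposition`, the direct (non-fast) Fourier sum, the periodic box of length 2π, and the mode range are all hypothetical choices made for clarity.

```python
import numpy as np


def local_modes(x, q, k):
    """Partial Fourier coefficients from one rank's particles.

    x: (n, d) positions, q: (n,) charges, k: (m, d) wave vectors.
    Returns an (m,) complex array with sum_p q_p * exp(-i k . x_p).
    Assumption: a direct O(n*m) sum is used here purely for clarity.
    """
    phase = x @ k.T  # (n, m) array of k . x_p
    return (q[:, None] * np.exp(-1j * phase)).sum(axis=0)


def particle_decomposition(x, q, k, n_ranks):
    """Emulate particle decomposition in one process (hypothetical helper).

    Only the particles are split; every emulated rank holds all modes.
    The final sum plays the role of an all-reduce across ranks.
    """
    x_parts = np.array_split(x, n_ranks)
    q_parts = np.array_split(q, n_ranks)
    partial = [local_modes(xp, qp, k) for xp, qp in zip(x_parts, q_parts)]
    return np.sum(partial, axis=0)


# Illustrative setup: 1000 particles in a periodic 3D box of length 2*pi.
rng = np.random.default_rng(0)
n_particles = 1000
x = rng.uniform(0.0, 2.0 * np.pi, size=(n_particles, 3))
q = np.full(n_particles, 1.0 / n_particles)

# All integer wave vectors with components in [-2, 2]: 5^3 = 125 modes.
modes = np.arange(-2, 3)
k = np.array(np.meshgrid(modes, modes, modes, indexing="ij")).reshape(3, -1).T

rho_serial = local_modes(x, q, k)
rho_parallel = particle_decomposition(x, q, k, n_ranks=4)
```

In this emulation the reduction involves one array whose length equals the number of modes, independent of the number of particles. In a distributed setting the same pattern would map to a single collective sum over the mode array per evaluation, which suggests why the number of modes matters for the communication cost of this strategy.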