EIRENE [1] is a Monte Carlo neutral transport solver widely used in the fusion community. EIRENE does not implement domain decomposition, so it cannot be used for simulations whose grid data does not fit in the memory of a single compute node (see e.g. [2]). This paper presents a domain-decomposed Monte Carlo (DDMC) algorithm implemented in a new open source Monte Carlo code, Eiron. Two parallel algorithms currently used in EIRENE are also implemented in Eiron, and the three algorithms are compared in strong scaling tests, with DDMC outperforming the other two in nearly all cases. On the supercomputer Mahti [3], DDMC strong scaling is superlinear for grids that do not fit into an L3 cache slice (4 MiB). The DDMC algorithm is also scaled up to 16384 cores in weak scaling tests, reaching a weak scaling efficiency of 45% in a high-collisional (heavier compute load) case and 26% in a low-collisional (lighter compute load) case. We conclude that implementing this domain decomposition algorithm in EIRENE would improve performance and enable simulations that are currently impossible due to memory constraints.
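
For readers unfamiliar with the scaling terms used above, the following is a minimal sketch of how strong and weak scaling efficiency are conventionally computed. The function names and all timings are hypothetical illustrations, not values or definitions taken from the paper, whose exact definitions may differ.

```python
def strong_scaling_efficiency(t_base, t_n, cores_base, cores_n):
    """Strong scaling: fixed total problem size, more cores.

    Returns speedup divided by the core-count ratio. A value above 1.0
    is superlinear scaling, which can occur, for example, when each
    core's share of the grid starts to fit in cache.
    """
    speedup = t_base / t_n
    return speedup / (cores_n / cores_base)


def weak_scaling_efficiency(t_base, t_n):
    """Weak scaling: problem size grows in proportion to the core count.

    Ideal behaviour is constant run time, so efficiency is the ratio of
    the baseline run time to the run time at scale.
    """
    return t_base / t_n


# Hypothetical timings in seconds (assumed for illustration only).
strong = strong_scaling_efficiency(t_base=100.0, t_n=10.0, cores_base=1, cores_n=8)
weak = weak_scaling_efficiency(t_base=9.0, t_n=20.0)

print(f"strong scaling efficiency: {strong:.2f}")  # above 1.0 means superlinear
print(f"weak scaling efficiency:   {weak:.2f}")
```

Under these conventions, a weak scaling efficiency of 45% would mean the run at scale takes roughly 2.2 times as long as the baseline run with the same work per core.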