Massively parallel Fourier transforms are widely used in computational sciences, particularly in computational fluid dynamics, where unbounded Poisson problems must be solved. In practice, the transform is usually the most time-consuming operation because of its unavoidable all-to-all communication pattern. The original flups library tackles that issue with an implementation of the distributed Fourier transform tailor-made for successive resolutions of unbounded Poisson problems. However, that implementation lacks flexibility: it supports only a cell-centered data layout and features a plain communication strategy. This work extends the library along two directions. First, the flups implementation is generalized to support a node-centered data layout. Second, three distinct approaches are provided to handle the communications: one all-to-all, and two non-blocking implementations relying respectively on manual packing and on MPI_Datatype to communicate over the network. The proposed software is validated against analytical solutions for unbounded, semi-unbounded, and periodic domains. The performance of the approaches is then compared against accFFT, another distributed FFT implementation, using a periodic case. Finally, the performance metrics of each implementation are analyzed and detailed on several top-tier European facilities, using up to 49,152 cores. This work makes flups a production-ready and performant distributed FFT library, featuring all the possible types of FFTs and offering flexibility in the data layout. The code is available under a BSD-3 license at github.com/vortexlab-uclouvain/flups.