We propose a parallel (distributed) version of the spectral proper orthogonal decomposition (SPOD) technique. The parallel SPOD algorithm distributes the dataset along its spatial dimension while leaving the time dimension undivided. This choice keeps the fast Fourier transform of the data in time non-distributed, that is, local to each process, thereby avoiding the bottlenecks associated with a distributed transform. The parallel SPOD algorithm is implemented in the PySPOD (https://github.com/MathEXLab/PySPOD) library and uses the Message Passing Interface (MPI) standard through its Python bindings, mpi4py (https://mpi4py.readthedocs.io/en/stable/). An extensive performance evaluation of the parallel package is provided, including strong and weak scalability analyses. The open-source library enables the analysis of large datasets of interest across the scientific community. Here, we present applications in fluid dynamics and geophysics that are extremely difficult (if not impossible) to carry out without a parallel algorithm. This work opens the path toward modal analyses of big quasi-stationary data, helping to uncover previously unexplored spatio-temporal patterns.
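The reason for splitting the data in space only can be illustrated with a minimal sketch. It is not PySPOD's implementation: it emulates the MPI ranks inside a single process with NumPy rather than calling mpi4py, the helper names `split_space` and `local_time_fft` are hypothetical, and the array sizes are arbitrary. It shows that, because the Fourier transform acts along time independently at every spatial point, each rank can transform its own spatial chunk without exchanging data with the others.

```python
import numpy as np


def split_space(data, n_ranks):
    """Split a (time, space) array into n_ranks chunks along the spatial axis.

    Hypothetical helper: each chunk stands in for the data owned by one MPI rank.
    """
    return np.array_split(data, n_ranks, axis=1)


def local_time_fft(chunk):
    """FFT along time (axis 0) of one spatial chunk.

    Only the chunk's own columns are used, so no other rank's data is needed.
    """
    return np.fft.fft(chunk, axis=0)


# Synthetic data, assumed layout: rows are time snapshots, columns are spatial points.
rng = np.random.default_rng(0)
n_time, n_space, n_ranks = 64, 10, 4
data = rng.standard_normal((n_time, n_space))

# Each emulated rank holds every time sample for its subset of spatial points.
chunks = split_space(data, n_ranks)
local_spectra = [local_time_fft(chunk) for chunk in chunks]

# Reassembling the per-rank spectra reproduces the transform of the full array.
reassembled = np.concatenate(local_spectra, axis=1)
reference = np.fft.fft(data, axis=0)
matches_reference = bool(np.allclose(reassembled, reference))
print("chunk widths:", [chunk.shape[1] for chunk in chunks])
print("matches undistributed FFT:", matches_reference)
```

Splitting along time instead would leave each rank with only part of every time series, so the transform itself would have to be distributed across ranks.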