First-principles density functional theory (DFT) with a plane-wave (PW) basis set is the most widely used method in quantum mechanical materials simulation, owing to its accuracy and generality. However, PW-based DFT calculations are commonly regarded as costly in both computation and memory, which currently limits their ability to simulate large-scale complex systems containing thousands of atoms. The problem is more acute on the new Sunway supercomputer, where each process is limited to 16 GB of memory. Here we present PWDFT-SW, a novel parallel implementation of plane-wave density functional theory on the new Sunway supercomputer. PWDFT-SW exploits the strengths of the Sunway supercomputer by extensively refactoring and tuning our algorithms to match the characteristics of the Sunway system. Through extensive numerical experiments, we demonstrate that our methods substantially reduce both computational cost and memory usage. Our optimizations yield a speedup of 64.8x for a physical system containing 4,096 silicon atoms and allow us to extend PW-based DFT calculations to large-scale systems containing 16,384 carbon atoms.