Classical simulations are essential for the development of quantum computing, and their exponential scaling can easily fill any modern supercomputer. In this paper we consider the performance and energy consumption of large Quantum Fourier Transform (QFT) simulations run on ARCHER2, the UK's National Supercomputing Service, using the QuEST toolkit. We take into account CPU clock frequency and node memory size, and use cache-blocking to rearrange the circuit so as to minimise communication. We find that running at 2.00 GHz instead of 2.25 GHz can save as much as 25% of energy for a 5% increase in runtime. Nodes with more memory also have the potential to be more efficient and to cost the user fewer CUs, but at a higher runtime penalty. Finally, we present a cache-blocking QFT circuit, which halves the required communication. Combined, our optimisations result in 40% faster simulations and 35% energy savings for 44-qubit simulations on 4,096 ARCHER2 nodes.
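To make the exponential scaling concrete, the following minimal sketch estimates the memory needed to store a full statevector. It assumes double-precision complex amplitudes (16 bytes each) and an even split of the state across nodes; the function names are illustrative and are not part of QuEST. It counts only the statevector itself, not any communication buffers or other overheads a distributed simulator may need.

```python
def statevector_bytes(num_qubits: int, bytes_per_amplitude: int = 16) -> int:
    """Bytes needed for 2**num_qubits amplitudes.

    Assumption: 16 bytes per amplitude (two 64-bit floats).
    """
    return (1 << num_qubits) * bytes_per_amplitude


def per_node_gib(num_qubits: int, num_nodes: int,
                 bytes_per_amplitude: int = 16) -> float:
    """Statevector share per node in GiB, assuming an even split."""
    total = statevector_bytes(num_qubits, bytes_per_amplitude)
    return total / num_nodes / 2**30


if __name__ == "__main__":
    total_tib = statevector_bytes(44) / 2**40
    print(f"44 qubits: {total_tib:.0f} TiB in total")
    print(f"per node on 4096 nodes: {per_node_gib(44, 4096):.0f} GiB")
```

Under these assumptions a 44-qubit state occupies 256 TiB, or 64 GiB per node on 4,096 nodes, and each additional qubit doubles both figures.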