Quantum computational superiority promises rapid computation and high energy efficiency. Despite recent advances in classical algorithms aimed at refuting the milestone claim of Google's Sycamore, challenges remain in generating uncorrelated samples of random quantum circuits. In this paper, we present a large-scale system technology that applies optimizations at the global, node, and device levels to achieve unprecedented scalability for tensor networks. It accommodates large-scale tensor networks requiring up to tens of terabytes of memory, exceeding the memory capacity of a single node, and scales to 2304 GPUs with a peak computing power of 561 PFLOPS at half precision. We achieve a time-to-solution of 14.22 seconds with an energy consumption of 2.39 kWh at a fidelity of 0.002. Our most notable result is a time-to-solution of 17.18 seconds with an energy consumption of only 0.29 kWh, reaching an XEB of 0.002 after post-processing. This outperforms Google's quantum processor Sycamore, which recorded 600 seconds and 4.3 kWh, in both speed and energy efficiency.
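The comparison with Sycamore can be checked with a few lines of arithmetic. The sketch below uses only the times and energies quoted in the abstract. The `Run` class, the run labels, and the derived mean-power figure are illustrative assumptions, not part of the paper, and the mean power is meaningful only if each energy figure covers the same interval as the corresponding time-to-solution.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    """One reported result (hypothetical container, not from the paper)."""
    label: str
    seconds: float  # time-to-solution
    kwh: float      # energy consumption

    @property
    def mean_power_kw(self) -> float:
        # 1 kWh = 3600 kJ, so kWh * 3600 / s gives kW.
        # Assumes the energy was spent over the time-to-solution interval.
        return self.kwh * 3600.0 / self.seconds


# Figures as quoted in the abstract.
SYCAMORE = Run("Sycamore", 600.0, 4.3)
RUNS = [
    Run("14.22 s run (fidelity 0.002)", 14.22, 2.39),
    Run("17.18 s run (XEB 0.002, post-processed)", 17.18, 0.29),
]


def compare(run: Run, baseline: Run) -> tuple[float, float]:
    """Return (speedup, energy reduction factor) of run relative to baseline."""
    return baseline.seconds / run.seconds, baseline.kwh / run.kwh


if __name__ == "__main__":
    for run in RUNS:
        speedup, saving = compare(run, SYCAMORE)
        print(
            f"{run.label}: {speedup:.1f}x faster, "
            f"{saving:.1f}x less energy, "
            f"mean power about {run.mean_power_kw:.0f} kW"
        )
```

Under these figures the post-processed run is roughly 35 times faster than the quoted Sycamore time and uses roughly 15 times less energy, while the 14.22-second run is faster still but saves less than a factor of two in energy.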