FQsun: A Configurable Wave Function-Based Quantum Emulator for Power-Efficient Quantum Simulations

Quantum computing has emerged as a powerful tool for solving complex computational problems, but access to real quantum hardware remains limited because of high costs, and demand for efficient quantum simulation is growing. Although software simulators running on CPUs/GPUs, such as Qiskit, ProjectQ, and Qsun, offer flexibility and support a large number of qubits, they suffer from high power consumption and limited processing speed, especially as qubit counts scale. Quantum emulators implemented on dedicated hardware, such as FPGAs and analog circuits, therefore offer a promising path toward better energy efficiency. However, existing hardware-based emulators still face challenges: limited flexibility, a lack of fidelity evaluation, and high power consumption. To close these gaps, we propose FQsun, a quantum emulator that improves performance by integrating four key innovations: efficient memory organization, a configurable Quantum Gate Unit (QGU), optimized scheduling, and support for multiple numerical precisions. Five FQsun versions with different numerical precisions (16-bit floating point, 32-bit floating point, 16-bit fixed point, 24-bit fixed point, and 32-bit fixed point) are implemented on the Xilinx ZCU102 FPGA, utilizing between 9,226 and 18,093 LUTs, 1,440 and 7,031 FFs, 344 and 464 BRAMs, and 14 and 88 DSPs, and consuming a maximum power of 2.41 W. Experimental results, evaluated in terms of normalized gate speed, fidelity, and mean square error, demonstrate high accuracy, particularly for the 32-bit fixed-point and floating-point versions, establishing FQsun's capability as a precise quantum emulator. Benchmarking on quantum algorithms such as the Quantum Fourier Transform, the Parameter-Shift Rule, and Random Quantum Circuits shows that FQsun achieves a superior power-delay product, outperforming traditional software simulators running on powerful CPUs by up to 9,870 times.
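
To make the evaluation metrics named above concrete, the following is a minimal Python sketch of how fidelity, mean square error, and power-delay product might be computed when a wave-function (state-vector) simulation is run at reduced fixed-point precision. It is an illustration only, not the FQsun implementation: the helper names (`apply_gate`, `quantize`, `run`), the test circuit, the numbers of fractional bits, and the 2-second delay are all assumptions chosen for the example. Only the 2.41 W power figure comes from the text.

```python
import math

def apply_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` of a state vector (list of complex)."""
    (g00, g01), (g10, g11) = gate
    out = list(state)
    stride = 1 << target
    for i in range(len(state)):
        if i & stride == 0:  # i has target bit 0; its partner has target bit 1
            a, b = state[i], state[i | stride]
            out[i] = g00 * a + g01 * b
            out[i | stride] = g10 * a + g11 * b
    return out

def quantize(state, frac_bits):
    """Round real and imaginary parts to a grid of 2**-frac_bits (assumed format)."""
    scale = 1 << frac_bits
    return [complex(round(z.real * scale) / scale,
                    round(z.imag * scale) / scale) for z in state]

def fidelity(ref, approx):
    """|<ref|approx>|^2, with both vectors normalized."""
    inner = sum(r.conjugate() * a for r, a in zip(ref, approx))
    norm_ref = sum(abs(r) ** 2 for r in ref)
    norm_approx = sum(abs(a) ** 2 for a in approx)
    return abs(inner) ** 2 / (norm_ref * norm_approx)

def mse(ref, approx):
    """Mean squared error between corresponding amplitudes."""
    return sum(abs(r - a) ** 2 for r, a in zip(ref, approx)) / len(ref)

def power_delay_product(power_watts, delay_seconds):
    """Power-delay product in joules (lower is better)."""
    return power_watts * delay_seconds

def run(n_qubits, frac_bits=None):
    """Hypothetical test circuit: RY(0.3*(q+1)) then H on every qubit q."""
    s = 1 / math.sqrt(2)
    hadamard = ((s, s), (s, -s))
    state = [0j] * (1 << n_qubits)
    state[0] = 1 + 0j
    for q in range(n_qubits):
        half = 0.3 * (q + 1) / 2
        ry = ((math.cos(half), -math.sin(half)),
              (math.sin(half), math.cos(half)))
        for gate in (ry, hadamard):
            state = apply_gate(state, gate, q)
            if frac_bits is not None:
                state = quantize(state, frac_bits)  # emulate reduced precision
    return state

reference = run(3)  # double-precision reference state
for bits in (8, 14, 30):
    approx = run(3, bits)
    print(bits, fidelity(reference, approx), mse(reference, approx))
```

In this sketch, quantizing after every gate stands in for the rounding a fixed-point datapath would introduce; more fractional bits should bring fidelity closer to 1 and MSE closer to 0, which is the trade-off the different precision versions explore.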
