Selected Basis Diagonalization (SBD) plays a central role in Sample-based Quantum Diagonalization (SQD), where iterative diagonalization of the Hamiltonian in selected configuration subspaces forms the dominant classical workload. We present a GPU-accelerated implementation of SBD using the Thrust library. By restructuring key components -- including configuration processing, excitation generation, and matrix-vector operations -- around fine-grained data-parallel primitives and flattened GPU-friendly data layouts, the proposed approach efficiently exploits modern GPU architectures. In our experiments, the Thrust-based SBD achieves up to $\sim$40$\times$ speedup over CPU execution and substantially reduces the total runtime of SQD iterations. These results demonstrate that GPU-native parallel primitives provide a simple, portable, and high-performance foundation for accelerating SQD-based quantum-classical workflows.
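To make the idea of a flattened layout and primitive-based matrix-vector operations concrete, the following is a minimal CPU-only sketch, not the implementation described here. It assumes the nonzero Hamiltonian elements of the selected subspace are stored as three parallel arrays (a COO-style layout). The names `FlatHamiltonian` and `matvec` are hypothetical. The product is written as an elementwise map followed by a reduction keyed on the row index. On a GPU these two passes would plausibly map onto `thrust::transform` and `thrust::reduce_by_key`, although the actual decomposition used in this work may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical flattened (COO-style) storage of the nonzero Hamiltonian
// elements within the selected configuration subspace.
struct FlatHamiltonian {
    std::size_t dim;               // number of selected configurations
    std::vector<std::size_t> row;  // row index of each nonzero element
    std::vector<std::size_t> col;  // column index of each nonzero element
    std::vector<double> val;       // value of the matrix element H(row, col)
};

// Computes y = H * x as two data-parallel-style passes.
std::vector<double> matvec(const FlatHamiltonian& h,
                           const std::vector<double>& x) {
    assert(h.row.size() == h.col.size() && h.col.size() == h.val.size());
    assert(x.size() == h.dim);

    // Pass 1 (elementwise map): prod[k] = val[k] * x[col[k]].
    std::vector<double> prod(h.val.size());
    std::transform(h.val.begin(), h.val.end(), h.col.begin(), prod.begin(),
                   [&x](double v, std::size_t c) { return v * x[c]; });

    // Pass 2 (reduction by key): accumulate products that share a row index.
    // Written as a serial loop here; a GPU version would use a segmented
    // reduction over entries sorted by row.
    std::vector<double> y(h.dim, 0.0);
    for (std::size_t k = 0; k < prod.size(); ++k) {
        y[h.row[k]] += prod[k];
    }
    return y;
}
```

For a 2-by-2 matrix with rows (1, 2) and (2, 3), calling `matvec` with the vector (1, 1) returns (3, 5). Keeping indices and values in flat arrays avoids pointer-chasing through nested containers, which is the property that makes such a layout convenient for fine-grained parallel primitives.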