A fault-tolerant quantum computer must decode and correct errors faster than they appear, to avoid the exponential slowdown that error correction would otherwise cause. The Union-Find (UF) decoder is a promising candidate, with an average time complexity slightly higher than $O(d^3)$, where $d$ is the code distance. We report a distributed version of the UF decoder that exploits parallel computing resources for further speedup. Using an FPGA-based implementation, we show empirically that this distributed UF decoder has a sublinear average time complexity with respect to $d$, given $O(d^3)$ parallel computing resources. The decoding time per measurement round decreases as $d$ increases, a first for a quantum error decoder. The implementation uses a scalable architecture called Helios, which organizes the parallel computing resources into a hybrid tree-grid structure. On a Xilinx VCU129 FPGA, we implement decoders for $d$ up to 21, with an average decoding time of 11.5 ns per measurement round under 0.1\% phenomenological noise, and 23.7 ns per measurement round for $d=17$ under equivalent circuit-level noise. This is significantly faster than any existing decoder implementation. Furthermore, we show that Helios can instead be optimized for resource efficiency: it decodes $d=51$ on a Xilinx VCU129 FPGA with an average latency of 544 ns per measurement round.
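For readers unfamiliar with the data structure behind the UF decoder, the following is a minimal sequential sketch of a disjoint-set (union-find) forest that tracks, for each cluster, whether it contains an odd number of defects; in UF decoding, odd clusters are the ones that keep growing until they merge. This is not the Helios architecture or the distributed algorithm reported here. The class name `ClusterSet` and its methods are hypothetical, chosen only for illustration, and cluster growth and boundary handling are omitted.

```python
class ClusterSet:
    """Disjoint-set forest whose roots track defect parity (illustrative only)."""

    def __init__(self, num_vertices, defects):
        self.parent = list(range(num_vertices))  # each vertex starts as its own root
        self.size = [1] * num_vertices           # cluster size, valid at roots
        defect_set = set(defects)
        # odd[r] is meaningful only while r is a root
        self.odd = [v in defect_set for v in range(num_vertices)]

    def find(self, v):
        # Locate the root, then compress the path so later lookups are short.
        root = v
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[v] != root:
            self.parent[v], v = root, self.parent[v]
        return root

    def union(self, a, b):
        # Merge the clusters of a and b; union by size keeps trees shallow.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return ra
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        # Parity of the merged cluster is the XOR of the two parities.
        self.odd[ra] = self.odd[ra] != self.odd[rb]
        return ra

    def is_odd(self, v):
        return self.odd[self.find(v)]


clusters = ClusterSet(5, defects=[1, 3])
print(clusters.is_odd(1))   # True: a lone defect forms an odd cluster
clusters.union(1, 2)
clusters.union(2, 3)
print(clusters.is_odd(1))   # False: two defects merged into one even cluster
```

In this sketch each `find` and `union` runs in near-constant amortized time, which is what makes the sequential decoder only slightly superlinear in the number of vertices; the distributed version described above targets the remaining dependence on $d$ by spreading the work across parallel resources.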