We consider the problem of synthesizing Clifford quantum circuits for devices with all-to-all qubit connectivity. We approach this task as a reinforcement learning problem in which an agent learns to discover a sequence of elementary Clifford gates that reduces a given symplectic matrix representation of a Clifford circuit to the identity. This formulation permits a simple learning curriculum based on random walks from the identity. We introduce a novel neural network architecture that is equivariant to qubit relabelings of the symplectic matrix representation and is size-agnostic, allowing a single learned policy to be applied across different qubit counts without circuit splicing or network reparameterization. On six-qubit Clifford circuits, the largest regime for which optimal references are available, our agent takes milliseconds per instance to find circuits within one two-qubit gate of optimal, and seconds per instance to find optimal circuits in 99.2% of instances. After continued training on ten-qubit instances, the agent scales to unseen Clifford tableaux with up to thirty qubits, including targets generated from circuits with over a thousand Clifford gates, where it achieves lower average two-qubit gate counts than Qiskit's Aaronson-Gottesman and greedy Clifford synthesizers.
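To make the formulation concrete, the following is a minimal sketch of the environment side only: a phase-free symplectic tableau, elementary gates acting on it, and a random walk from the identity that produces a training target. Everything here is an illustrative assumption rather than the authors' implementation: the gate set {H, S, CX}, the column ordering `[x_0..x_{n-1}, z_0..z_{n-1}]`, and the function names `identity_tableau`, `apply_gate`, `is_symplectic`, `random_walk`, and `undo` are all hypothetical. The learned policy and the equivariant network are not modeled.

```python
import numpy as np


def identity_tableau(n):
    """2n x 2n binary symplectic matrix of the identity circuit (phases ignored)."""
    return np.eye(2 * n, dtype=np.uint8)


def apply_gate(tab, gate):
    """Return a new tableau after one gate, acting on columns over GF(2).

    Assumed column layout: x_0..x_{n-1}, z_0..z_{n-1}.
    """
    tab = tab.copy()
    n = tab.shape[0] // 2
    name, *qubits = gate
    if name == "H":      # swap the x and z columns of the qubit
        a = qubits[0]
        tab[:, [a, n + a]] = tab[:, [n + a, a]]
    elif name == "S":    # z_a <- z_a XOR x_a
        a = qubits[0]
        tab[:, n + a] ^= tab[:, a]
    elif name == "CX":   # x_t <- x_t XOR x_c and z_c <- z_c XOR z_t
        c, t = qubits
        tab[:, t] ^= tab[:, c]
        tab[:, n + c] ^= tab[:, n + t]
    else:
        raise ValueError(f"unknown gate {name!r}")
    return tab


def is_symplectic(tab):
    """Check M @ Omega @ M.T == Omega (mod 2)."""
    n = tab.shape[0] // 2
    zero = np.zeros((n, n), dtype=np.int64)
    eye = np.eye(n, dtype=np.int64)
    omega = np.block([[zero, eye], [eye, zero]])
    m = tab.astype(np.int64)
    return np.array_equal((m @ omega @ m.T) % 2, omega)


def random_walk(n, steps, rng):
    """Apply `steps` random gates to the identity; return (tableau, gate list)."""
    tab, gates = identity_tableau(n), []
    for _ in range(steps):
        kind = int(rng.integers(3))
        if kind == 2 and n >= 2:
            c, t = (int(q) for q in rng.choice(n, size=2, replace=False))
            gate = ("CX", c, t)
        else:
            gate = ("H" if kind == 0 else "S", int(rng.integers(n)))
        tab = apply_gate(tab, gate)
        gates.append(gate)
    return tab, gates


def undo(tab, gates):
    """Reduce a tableau by replaying the walk in reverse.

    Without phases, H, S, and CX each square to the identity on the tableau,
    so the reversed gate list is an inverse of the walk.
    """
    for gate in reversed(gates):
        tab = apply_gate(tab, gate)
    return tab


rng = np.random.default_rng(0)
target, walk = random_walk(n=4, steps=25, rng=rng)
two_qubit_count = sum(g[0] == "CX" for g in walk)
reduced = undo(target, walk)
```

In this sketch the reversed walk is only an upper bound on the cost of reducing `target` to the identity. The agent's task, as described above, is to find a shorter reducing sequence, with the two-qubit gate count as the quantity of interest. Lengthening `steps` gradually is one way to read the "random walks from the identity" curriculum.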