Unitary synthesis, the decomposition of a unitary matrix into a sequence of quantum gates, is a fundamental challenge in quantum compilation. Prevailing reinforcement learning (RL) approaches are often hampered by sparse reward signals, which necessitate complex reward shaping or long training times, and they typically converge to a single policy that lacks solution diversity. In this work, we propose QFlowNet, a novel framework that learns efficiently from sparse signals by pairing a Generative Flow Network (GFlowNet) with Transformers. Our approach addresses two key challenges. First, the GFlowNet framework is designed to learn a diverse policy that samples solutions with probability proportional to their reward, overcoming the single-solution limitation of RL while offering faster inference than other generative models such as diffusion models. Second, Transformers act as a powerful encoder, capturing the non-local structure of unitary matrices and compressing a high-dimensional state into a dense latent representation for the policy network. Our agent achieves an overall success rate of 99.7% on a 3-qubit benchmark (lengths 1-12) and discovers a diverse set of compact circuits, establishing QFlowNet as an efficient paradigm for unitary synthesis that yields diverse solutions.
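To make the terms in the abstract concrete, the following is a minimal sketch of the environment side of unitary synthesis: composing gates into a circuit unitary, scoring a candidate with a sparse (0 or 1) reward, and forming the reward-proportional distribution that a GFlowNet is trained to sample from. It is not the QFlowNet model. The 2-qubit size, the gate set (`h0`, `h1`, `t0`, `t1`, `cx01`), the tolerance, and all function names are assumptions chosen for illustration, and the candidates are enumerated by hand, whereas a trained policy would sample them without enumeration.

```python
import numpy as np

# Standard single- and two-qubit gate matrices.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
T = np.diag([1, np.exp(1j * np.pi / 4)])
I2 = np.eye(2, dtype=complex)
CNOT = np.array(
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], dtype=complex
)

# Hypothetical 2-qubit gate set (an assumption, not the paper's benchmark).
GATES = {
    "h0": np.kron(H, I2),
    "h1": np.kron(I2, H),
    "t0": np.kron(T, I2),
    "t1": np.kron(I2, T),
    "cx01": CNOT,
}


def circuit_unitary(gate_names):
    """Multiply gates in application order into a single 4x4 unitary."""
    u = np.eye(4, dtype=complex)
    for name in gate_names:
        u = GATES[name] @ u  # a later gate multiplies on the left
    return u


def sparse_reward(target, gate_names, tol=1e-9):
    """Return 1.0 if the circuit equals the target up to global phase, else 0.0."""
    u = circuit_unitary(gate_names)
    overlap = abs(np.trace(target.conj().T @ u)) / target.shape[0]
    return 1.0 if overlap > 1 - tol else 0.0


def reward_proportional_probs(rewards):
    """Normalize rewards into the distribution p(x) = R(x) / sum of R."""
    r = np.asarray(rewards, dtype=float)
    return r / r.sum()


target = circuit_unitary(["h0", "cx01"])
candidates = [
    ["h0", "cx01"],              # the shortest matching circuit
    ["h1", "h1", "h0", "cx01"],  # also matches, because H applied twice is the identity
    ["cx01", "h0"],              # wrong gate order, does not match
]
rewards = [sparse_reward(target, c) for c in candidates]
probs = reward_proportional_probs(rewards)
print(rewards, probs)
```

In this toy setting the two circuits that implement the target share the probability mass equally and the incorrect circuit receives none, which is the sense in which reward-proportional sampling keeps several valid solutions rather than collapsing to one.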