Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry

Neural-network quantum states (NQS) have emerged as a powerful application of quantum-inspired deep learning for variational Monte Carlo methods, offering a competitive alternative to existing techniques for identifying ground states of quantum problems. A significant advancement toward improving the practical scalability of NQS has been the incorporation of autoregressive models, most recently transformers, as variational ansätze. Transformers learn sequence information with greater expressiveness than recurrent models, but at the cost of increased time complexity with respect to sequence length. We explore the use of the retentive network (RetNet), a recurrent alternative to transformers, as an ansatz for solving electronic ground state problems in $\textit{ab initio}$ quantum chemistry. Unlike transformers, RetNets overcome this time complexity bottleneck by processing data in parallel during training and recurrently during inference. We give a simple computational cost estimate of the RetNet and directly compare it with similar estimates for transformers, establishing a clear threshold ratio of problem-to-model size past which the RetNet's time complexity outperforms that of the transformer. Though this efficiency can come at the expense of decreased expressiveness relative to the transformer, we overcome this gap through training strategies that leverage the autoregressive structure of the model, namely variational neural annealing. Our findings support the RetNet as a means of improving the time complexity of NQS without sacrificing accuracy. We provide further ablation evidence that the improvements from neural annealing extend beyond the RetNet architecture, suggesting it would serve as an effective general training strategy for autoregressive NQS.
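To illustrate the parallel/recurrent duality mentioned above, the following is a minimal sketch of a single retention head written in both forms, using only the Python standard library. It is a simplified illustration and not the implementation used in this work: it omits the position-dependent rotation, multi-head decay rates, and normalization of the full RetNet, and the function names, the decay value `gamma=0.9`, and the toy sizes are all assumptions made for the example. The parallel form evaluates all positions from a decay-masked score matrix, while the recurrent form carries a fixed-size state from token to token.

```python
import random

def retention_parallel(Q, K, V, gamma):
    """Parallel form: O = (Q K^T * D) V with D[n][m] = gamma**(n - m) for n >= m, else 0."""
    N, d_v = len(Q), len(V[0])
    out = []
    for n in range(N):
        row = [0.0] * d_v
        for m in range(n + 1):  # causal mask: only positions m <= n contribute
            score = sum(q * k for q, k in zip(Q[n], K[m])) * gamma ** (n - m)
            for j in range(d_v):
                row[j] += score * V[m][j]
        out.append(row)
    return out

def retention_recurrent(Q, K, V, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + K_n^T V_n, then o_n = Q_n S_n."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # state size is independent of sequence length
    out = []
    for q, k, v in zip(Q, K, V):
        for i in range(d_k):
            for j in range(d_v):
                S[i][j] = gamma * S[i][j] + k[i] * v[j]
        out.append([sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)])
    return out

def random_matrix(rows, cols, rng):
    """Hypothetical helper: a rows x cols matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
N, d = 6, 4  # toy sequence length and head dimension (assumed values)
Q, K, V = (random_matrix(N, d, rng) for _ in range(3))

O_par = retention_parallel(Q, K, V, gamma=0.9)
O_rec = retention_recurrent(Q, K, V, gamma=0.9)

# Largest elementwise disagreement between the two forms (floating-point noise only).
max_diff = max(abs(a - b) for ra, rb in zip(O_par, O_rec) for a, b in zip(ra, rb))
```

In this sketch each recurrent step costs on the order of `d_k * d_v` operations regardless of position, whereas the parallel form's work grows quadratically with sequence length. That trade-off between sequence length and model width is the kind of comparison the cost estimate above is concerned with, though the sketch does not reproduce the estimate itself.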
