In quantum machine learning (QML), classical data are often encoded as quantum pure states and processed directly as quantum representations, motivating representation-level generative modeling that samples new quantum states from an underlying pure-state ensemble rather than re-preparing them from perturbed classical inputs. However, extending \emph{score-based} diffusion models with well-defined reverse-time samplers to quantum pure-state ensembles remains challenging, due to the non-Euclidean geometry of the complex projective space $\mathbb{CP}^{d-1}$ and the intractability of transition densities. We propose \emph{Stochastic Schrödinger Diffusion Models} (SSDMs), an intrinsic score-based generative framework on $\mathbb{CP}^{d-1}$ endowed with the Fubini--Study (FS) metric. SSDMs formulate a forward Riemannian diffusion with a stochastic Schrödinger equation (SSE) realization, and derive reverse-time dynamics driven by the Riemannian score $\nabla_{\mathrm{FS}} \log p_t$. To enable training without analytic transition densities, we introduce a local-time objective based on a local Euclidean Ornstein--Uhlenbeck approximation in FS normal coordinates, yielding an analytic teacher score mapped back to the manifold. Experiments show that SSDMs faithfully capture target pure-state ensemble statistics, including observable moments, overlap-kernel MMD, and entanglement measures, and that SSDM-generated quantum representations improve downstream QML generalization via representation-level data augmentation.
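For orientation, the reverse-time dynamics mentioned in the abstract can be written in the generic form used in Riemannian score-based modeling. This is a sketch under assumed notation, not necessarily the exact SSDM parametrization: the drift $b$, the noise scale $g$, the time horizon $T$, and the FS Brownian motion $B_t^{\mathrm{FS}}$ are symbols introduced here for illustration only. Suppose the forward diffusion on $\mathbb{CP}^{d-1}$ is formally
\[
\mathrm{d}\psi_t = b(\psi_t, t)\,\mathrm{d}t + g(t)\,\mathrm{d}B_t^{\mathrm{FS}}, \qquad t \in [0, T],
\]
with $p_t$ the density of $\psi_t$ with respect to the FS volume measure. Under suitable regularity conditions, the time-reversed process $\bar{\psi}_t = \psi_{T-t}$ then satisfies
\[
\mathrm{d}\bar{\psi}_t = \Bigl[-b(\bar{\psi}_t, T-t) + g(T-t)^2\,\nabla_{\mathrm{FS}} \log p_{T-t}(\bar{\psi}_t)\Bigr]\mathrm{d}t + g(T-t)\,\mathrm{d}\bar{B}_t^{\mathrm{FS}},
\]
so a sampler requires only an approximation of the Riemannian score $\nabla_{\mathrm{FS}} \log p_t$, which is the quantity the training objective is designed to learn.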