Under the data manifold hypothesis, high-dimensional data are concentrated near a low-dimensional manifold. We study the problem of Riemannian optimization over such manifolds when they are given only implicitly through the data distribution, and the standard manifold operations required by classical algorithms are unavailable. This formulation captures a broad class of data-driven design problems that are central to modern generative AI. Our key idea is to introduce a link function that connects the data distribution to the geometric operations needed for optimization. We show that this function enables the recovery of essential manifold operations, such as retraction and Riemannian gradient computation. Moreover, we establish a direct connection between our construction and the score function in diffusion models of the data distribution. This connection allows us to leverage well-studied parameterizations, efficient training procedures, and even pretrained score networks from the diffusion model literature to perform optimization. Building on this foundation, we propose two efficient inference-time algorithms -- Denoising Landing Flow (DLF) and Denoising Riemannian Gradient Descent (DRGD) -- and provide theoretical guarantees for both feasibility (approximate manifold adherence) and optimality (small Riemannian gradient norm). Finally, we demonstrate the effectiveness of our approach on finite-horizon reference tracking tasks in data-driven control, highlighting its potential for practical generative and design applications.
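The core mechanism can be illustrated on a toy problem. The sketch below (an illustration under strong assumptions, not the paper's exact DLF algorithm) takes the data manifold to be the unit circle in R^2; for data on the circle smoothed by Gaussian noise of scale sigma, the score of the smoothed density is approximately radial, s(x) ≈ (x/‖x‖ − x)/σ², so a landing-style iteration can descend on the objective while the score term attracts the iterate back toward the manifold, with no explicit retraction. The objective, step sizes, and closed-form surrogate score are all choices made for this example.

```python
import numpy as np

# Toy data manifold: the unit circle in R^2. For circle data smoothed by
# Gaussian noise of scale sigma, the score of the smoothed density is
# approximately radial (closed-form surrogate, an assumption of this sketch):
sigma = 0.1

def score(x):
    """Surrogate for the diffusion-model score near the circle."""
    return (x / np.linalg.norm(x) - x) / sigma**2

def grad_f(x):
    """Euclidean gradient of f(x) = -x[0], i.e. maximize the first coordinate."""
    return np.array([-1.0, 0.0])

# Landing-style update: descend on f while the score term pulls the iterate
# back toward the manifold -- no explicit retraction step is performed.
eta, lam = 0.01, 0.5          # objective step size, manifold-attraction weight
x = np.array([0.0, 1.0])      # start on the circle, far from the optimum
for _ in range(2000):
    x = x - eta * grad_f(x) + lam * sigma**2 * score(x)

print(np.round(x, 2))  # lands near (1, 0), the maximizer of x[0] on the circle
```

At the fixed point the attraction term balances the normal component of the objective step, so the iterate settles slightly off the circle (within O(eta/lam) of it), which mirrors the "approximate manifold adherence" guarantee described above.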