We address data-driven learning of the infinitesimal generator of stochastic diffusion processes, essential for understanding numerical simulations of natural and physical systems. The unbounded nature of the generator poses significant challenges, rendering conventional analysis techniques for Hilbert-Schmidt operators ineffective. To overcome this, we introduce a novel framework based on the energy functional for these stochastic processes. Our approach integrates physical priors through an energy-based risk metric in both full and partial knowledge settings. We evaluate the statistical performance of a reduced-rank estimator in reproducing kernel Hilbert spaces (RKHS) in the partial knowledge setting. Notably, our approach provides learning bounds independent of the state space dimension and ensures non-spurious spectral estimation. Additionally, we elucidate how the distortion between the intrinsic energy-induced metric of the stochastic diffusion and the RKHS metric used for generator estimation impacts the spectral learning bounds.