Inferring a diffusion equation from discretely observed measurements is a statistical challenge of significant importance in a variety of fields, from single-molecule tracking in biophysical systems to modeling financial instruments. Assuming that the underlying dynamical process obeys a $d$-dimensional stochastic differential equation of the form $$\mathrm{d}\boldsymbol{x}_t=\boldsymbol{b}(\boldsymbol{x}_t)\mathrm{d} t+\Sigma(\boldsymbol{x}_t)\mathrm{d}\boldsymbol{w}_t,$$ we propose neural network-based estimators of both the drift $\boldsymbol{b}$ and the spatially inhomogeneous diffusion tensor $D = \Sigma\Sigma^{T}$, and we provide statistical convergence guarantees when $\boldsymbol{b}$ and $D$ are $s$-Hölder continuous. Notably, our bound matches the minimax optimal rate $N^{-\frac{2s}{2s+d}}$ for nonparametric function estimation even though the observations are correlated, a dependence that requires careful handling when establishing fast-rate generalization bounds. Numerical experiments support the theoretical results by demonstrating accurate inference of spatially inhomogeneous diffusion tensors.
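To make the estimation problem concrete, the following is a minimal sketch in one dimension. It is not the neural network estimator proposed here; it swaps in a classical binned finite-difference (Kramers-Moyal-type) baseline, which relies on the small-time moments $\mathbb{E}[\Delta x \mid x]/\Delta t \approx b(x)$ and $\mathbb{E}[(\Delta x)^2 \mid x]/\Delta t \approx D(x)$ with $D = \Sigma^2$. The test equation ($b(x) = -x$, $\Sigma(x) = 1 + 0.3\sin x$), the step size, the bin layout, and all function names are assumptions chosen purely for illustration.

```python
import math
import numpy as np


def drift(x):
    # Assumed toy drift b(x) = -x (mean-reverting); not taken from the paper.
    return -x


def sigma(x):
    # Assumed toy noise amplitude Sigma(x); the diffusion is D(x) = Sigma(x)**2.
    return 1.0 + 0.3 * math.sin(x)


def simulate(n_steps, dt, x0=0.0, seed=0):
    """Euler-Maruyama discretization of dx = b(x) dt + Sigma(x) dw."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(n_steps) * math.sqrt(dt)  # Brownian increments
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        x[k + 1] = x[k] + drift(x[k]) * dt + sigma(x[k]) * noise[k]
    return x


def binned_estimates(x, dt, edges):
    """Per-bin averages of dx/dt and dx**2/dt along a single trajectory."""
    dx = np.diff(x)
    idx = np.digitize(x[:-1], edges) - 1        # bin index of each start point
    n_bins = len(edges) - 1
    keep = (idx >= 0) & (idx < n_bins)          # discard points outside the grid
    idx, dx = idx[keep], dx[keep]
    counts = np.bincount(idx, minlength=n_bins)
    b_hat = np.bincount(idx, weights=dx, minlength=n_bins) / (counts * dt)
    d_hat = np.bincount(idx, weights=dx**2, minlength=n_bins) / (counts * dt)
    return b_hat, d_hat, counts


dt = 0.01
edges = np.linspace(-1.0, 1.0, 9)               # 8 bins of width 0.25
centers = 0.5 * (edges[:-1] + edges[1:])
x = simulate(n_steps=400_000, dt=dt)
b_hat, d_hat, counts = binned_estimates(x, dt, edges)

b_true = -centers
d_true = (1.0 + 0.3 * np.sin(centers)) ** 2
print("max drift error:    ", np.max(np.abs(b_hat - b_true)))
print("max diffusion error:", np.max(np.abs(d_hat - d_true)))
```

All increments come from one trajectory, so consecutive samples are correlated, which is the setting the convergence guarantee addresses. A binned baseline of this kind degrades quickly as $d$ grows, which is one reason to consider a neural network parameterization of $\boldsymbol{b}$ and $D$ instead.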