We consider multiscale overdamped Langevin stochastic differential equations and study the problem of learning the drift function of the homogenized dynamics from continuous-time observations of the multiscale system. We expand the drift term in a truncated series of basis functions and employ stochastic gradient descent in continuous time to infer the coefficients of the expansion. Because the multiscale data are incompatible with the homogenized model, the estimator alone cannot reconstruct the exact drift. We therefore propose to filter the original trajectory through appropriate kernels and to include the filtered data in the stochastic differential equation for the estimator, which resolves the misspecification issue. Several numerical experiments highlight the accuracy of our approach. Moreover, in a simplified framework, we prove that our estimator is asymptotically unbiased in the limit of infinite data and as the multiscale parameter describing the fastest scale vanishes.
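To make the procedure concrete, the following is a minimal one-dimensional sketch, not the implementation used in the paper. Everything specific in it is an assumption made for illustration: the slow potential V(x) = x^2/2 (so the drift is expanded in the single basis function x), the fast periodic potential p(y) = cos(y), an exponential filtering kernel of width `delta`, a learning rate of the form a/(b + t), Euler discretizations throughout, and the helper names `simulate_multiscale`, `exponential_filter`, and `sgdct_filtered`.

```python
import math

import numpy as np


def simulate_multiscale(alpha, sigma, eps, T, dt, rng, x0=0.0):
    """Euler-Maruyama for dX = -alpha*X dt + (1/eps)*sin(X/eps) dt + sqrt(2*sigma) dW.

    Assumed potentials: V(x) = x^2/2 (slow) and p(y) = cos(y) (fast, periodic).
    """
    n = int(round(T / dt))
    x = np.empty(n + 1)
    x[0] = x0
    noise = np.sqrt(2.0 * sigma * dt) * rng.standard_normal(n)
    for k in range(n):
        drift = -alpha * x[k] + math.sin(x[k] / eps) / eps
        x[k + 1] = x[k] + drift * dt + noise[k]
    return x


def exponential_filter(x, dt, delta):
    """Forward-Euler solution of dZ = (X - Z)/delta dt with Z_0 = 0.

    This corresponds to filtering X with the kernel k(r) = exp(-r/delta)/delta.
    """
    z = np.zeros(len(x))
    for k in range(len(x) - 1):
        z[k + 1] = z[k] + (dt / delta) * (x[k] - z[k])
    return z


def sgdct_filtered(x, dt, delta, a=5.0, b=5.0, theta0=0.0):
    """Discretized continuous-time SGD for the model dX = -theta*X dt + noise.

    The filtered path Z replaces X in the gradient factor only:
        d(theta) = -xi_t * Z_t * (dX_t + theta * X_t dt),  xi_t = a / (b + t).
    """
    z = exponential_filter(x, dt, delta)
    theta = theta0
    for k in range(len(x) - 1):
        xi = a / (b + k * dt)           # decaying learning rate (assumed form)
        dx = x[k + 1] - x[k]            # observed increment of the data
        theta -= xi * z[k] * (dx + theta * x[k] * dt)
    return theta


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dt = 1e-3                           # must be small relative to eps**2
    path = simulate_multiscale(alpha=1.0, sigma=1.0, eps=0.1, T=200.0, dt=dt, rng=rng)
    print("estimated drift coefficient:", sgdct_filtered(path, dt, delta=1.0))
```

In this sketch the filter enters only through the factor multiplying the increment, while the increment `dx` itself is taken from the unfiltered data. Passing a very small `delta` makes the filtered path track the raw path, which gives a rough way to compare against the unfiltered estimator on the same trajectory.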