An Ensemble Score Filter for Tracking High-Dimensional Nonlinear Dynamical Systems

We propose an ensemble score filter (EnSF) for solving high-dimensional nonlinear filtering problems with superior accuracy. A major drawback of existing filtering methods, e.g., particle filters or ensemble Kalman filters, is their low accuracy on high-dimensional and highly nonlinear problems. EnSF addresses this challenge by exploiting a score-based diffusion model, defined in a pseudo-temporal domain, to characterize the evolution of the filtering density. EnSF stores the information of the recursively updated filtering density function in the score function, instead of storing it in a finite set of Monte Carlo samples (as in particle filters and ensemble Kalman filters). Unlike existing diffusion models that train neural networks to approximate the score function, we develop a training-free score estimation that uses a mini-batch-based Monte Carlo estimator to directly approximate the score function at any pseudo-spatial-temporal location. This provides sufficient accuracy for solving high-dimensional nonlinear problems and saves the tremendous amount of time otherwise spent on training neural networks. Another essential aspect of EnSF is its analytical update step, which gradually incorporates data information into the score function and is crucial in mitigating the degeneracy issue that arises in very high-dimensional nonlinear filtering problems. High-dimensional Lorenz systems are used to demonstrate the performance of our method. EnSF performs surprisingly well in reliably tracking extremely high-dimensional Lorenz systems (up to 1,000,000 dimensions) with highly nonlinear observation processes, a well-known challenging problem for existing filtering methods.
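To make the idea of a training-free, mini-batch Monte Carlo score estimate concrete, the following is a minimal sketch and not the implementation used in this work. It assumes a Gaussian perturbation kernel Z_t | Z_0 ~ N(alpha_t Z_0, beta_t^2 I) with the illustrative schedule alpha_t = 1 - t and beta_t^2 = t; the schedule, the function name mc_score, and its parameters are assumptions introduced here, since the abstract does not specify them. Under that kernel, the diffused ensemble density is a Gaussian mixture, so its score at any point (z, t) is a softmax-weighted average of per-sample terms and needs no neural network.

```python
import numpy as np

def mc_score(z, t, ensemble, batch_size=None, rng=None):
    """Estimate the score of the diffused ensemble density at (z, t).

    Hypothetical helper for illustration. Assumes the kernel
    Z_t | Z_0 ~ N(alpha_t * Z_0, beta_t^2 * I) with alpha_t = 1 - t and
    beta_t^2 = t (an assumed schedule), for pseudo-time t in (0, 1].
    z has shape (d,), ensemble has shape (J, d).
    """
    rng = np.random.default_rng() if rng is None else rng
    # Optionally subsample a mini-batch of ensemble members.
    if batch_size is not None and batch_size < len(ensemble):
        idx = rng.choice(len(ensemble), size=batch_size, replace=False)
        batch = ensemble[idx]
    else:
        batch = ensemble
    alpha, beta2 = 1.0 - t, t
    diff = z - alpha * batch                        # shape (J, d)
    # Log of the unnormalized Gaussian weights, shifted for stability.
    logw = -0.5 * np.sum(diff ** 2, axis=1) / beta2
    logw -= logw.max()
    w = np.exp(logw)
    w /= w.sum()
    # Weighted average of the per-sample conditional scores -(z - alpha*x)/beta^2.
    return -(w[:, None] * diff).sum(axis=0) / beta2
```

With the full ensemble this is the exact score of the Gaussian mixture induced by the samples; passing batch_size trades some accuracy for cost, which is the role a mini-batch plays in an estimator of this kind.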
