Data assimilation (DA) combines partial observations with a dynamical model to improve state estimation. Filter-based DA uses only past and present data and is a prerequisite for real-time forecasts. Smoother-based DA exploits both past and future observations, and is used to fill in missing data, provide more accurate estimates, and build high-quality datasets. However, the standard smoothing procedure requires all historical state estimates, which makes it storage-demanding, especially for high-dimensional systems. This paper develops an adaptive-lag online smoother for a large class of complex dynamical systems with strong nonlinear and non-Gaussian features, which has important applications to many real-world problems. The adaptive lag allows the DA to use only observations within a nearby window, significantly reducing computational storage. Online lag adjustment is essential for tackling turbulent systems, where temporal autocorrelation varies significantly over time due to intermittency, extreme events, and nonlinearity. An information criterion based on the uncertainty reduction in the estimated state is developed to determine the adaptive lag systematically. Notably, the mathematical structure of these systems allows the online smoother and the adaptive lag to be calculated with closed analytic formulae, avoiding the empirical tuning required by ensemble-based DA methods. The adaptive online smoother is applied to three important scientific problems. First, it helps detect causal relationships between state variables online. Second, its computational storage advantage is illustrated via Lagrangian DA, a high-dimensional nonlinear problem. Finally, the adaptive smoother advances online parameter estimation with partial observations, emphasizing the role of observed extreme events in accelerating convergence.
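The abstract gives no formulas, so the following is only a minimal sketch of the adaptive-lag idea under strong simplifying assumptions. It uses a scalar linear Gaussian model, with the standard Kalman filter and Rauch-Tung-Striebel (RTS) backward recursion standing in for the paper's closed analytic formulae. The information criterion is assumed here to be the Gaussian relative entropy between the updated and previous smoothed estimates, which may differ from the paper's criterion. All names (`kalman_step`, `relative_entropy`, `adaptive_lag_smoother`, `tol`, `max_lag`) are hypothetical.

```python
import math
import random


def kalman_step(m, P, y, a, q, r):
    """One predict/update step of a scalar Kalman filter for
    x_{k+1} = a x_k + N(0, q),  y_k = x_k + N(0, r)."""
    m_pred = a * m
    P_pred = a * a * P + q
    K = P_pred / (P_pred + r)
    return m_pred, P_pred, m_pred + K * (y - m_pred), (1.0 - K) * P_pred


def relative_entropy(m1, P1, m0, P0):
    """KL divergence of N(m1, P1) from N(m0, P0) for scalar Gaussians.
    Assumed stand-in for the paper's uncertainty-reduction criterion."""
    return 0.5 * ((m1 - m0) ** 2 / P0 + P1 / P0 - 1.0 - math.log(P1 / P0))


def adaptive_lag_smoother(ys, a, q, r, m0=0.0, P0=1.0, tol=1e-4, max_lag=50):
    """Online smoother: after each new observation, run the RTS recursion
    backward only until the information gained at a past state drops
    below `tol` (or `max_lag` steps have been taken)."""
    filt = []    # per step: (m_pred, P_pred, m_filt, P_filt)
    smooth = []  # per step: current smoothed (mean, variance)
    lags = []    # number of past states updated at each step
    m, P = m0, P0
    for n, y in enumerate(ys):
        m_pred, P_pred, m, P = kalman_step(m, P, y, a, q, r)
        filt.append((m_pred, P_pred, m, P))
        smooth.append((m, P))  # at the newest time, smoother equals filter
        lag = 0
        for k in range(n - 1, max(n - 1 - max_lag, -1), -1):
            _, _, mf, Pf = filt[k]
            mp_next, Pp_next, _, _ = filt[k + 1]
            ms_next, Ps_next = smooth[k + 1]
            C = Pf * a / Pp_next  # RTS smoother gain
            ms = mf + C * (ms_next - mp_next)
            Ps = Pf + C * C * (Ps_next - Pp_next)
            gain = relative_entropy(ms, Ps, *smooth[k])
            smooth[k] = (ms, Ps)
            lag += 1
            if gain < tol:  # new data barely changes this state: stop
                break
        lags.append(lag)
    # For simplicity the full history is kept in `filt` and `smooth`;
    # only the most recent max_lag + 1 entries are ever read or written.
    return smooth, lags


# Small synthetic demo with assumed parameter values.
random.seed(0)
a, q, r = 0.9, 0.1, 0.5
x, ys = 0.0, []
for _ in range(200):
    x = a * x + random.gauss(0.0, math.sqrt(q))
    ys.append(x + random.gauss(0.0, math.sqrt(r)))
smooth, lags = adaptive_lag_smoother(ys, a, q, r)
print("mean lag:", sum(lags) / len(lags), "max lag:", max(lags))
```

In this sketch the lag is chosen anew at every observation, so it can shrink when new data carry little information about the past and grow when they carry more. Because each backward pass touches at most `max_lag` past states, an implementation could discard older entries, which is one plausible reading of the storage saving described above.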