The majority of data assimilation (DA) methods in the geosciences are based on Gaussian assumptions. While these assumptions facilitate efficient algorithms, they cause analysis biases and subsequent forecast degradations. Non-parametric, particle-based DA algorithms have superior accuracy, but their application to high-dimensional models still poses operational challenges. Drawing inspiration from recent advances in the field of generative artificial intelligence (AI), this article introduces a new nonlinear estimation theory which attempts to bridge the existing gap in DA methodology. Specifically, a Conjugate Transform Filter (CTF) is derived and shown to generalize the celebrated Kalman filter to arbitrarily non-Gaussian distributions. The new filter has several desirable properties, such as its ability to preserve statistical relationships in the prior state and convergence to highly accurate observations. An ensemble approximation of the new theory (ECTF) is also presented and validated using idealized statistical experiments that feature bounded quantities with non-Gaussian distributions, a prevalent challenge in Earth system models. Results from these experiments indicate that the greatest benefits from ECTF occur when observation errors are small relative to the forecast uncertainty and when state variables exhibit strong nonlinear dependencies. Ultimately, the new filtering theory offers exciting avenues for improving conventional DA algorithms through their principled integration with AI techniques.