A Data-Consistent Approach to Ensemble Filtering

Ensemble filtering of chaotic, partially observed systems is often performed with ensembles far smaller than the state dimension, resulting in low-rank empirical covariances. In this regime, stochastic observation perturbations can degrade both accuracy and probabilistic calibration. We develop a data-consistent perspective on ensemble filtering and introduce the Quantity-of-Interest Principal Component Analysis Ensemble Data Consistent Filter (QPCA-EnDCF), a deterministic method that replaces perturbed observations with a spectrally regularized update in observation space. The method whitens forecast--observation residuals, computes an empirical eigendecomposition of the residual covariance, and restricts the correction to a rank-$\kappa$ subspace before mapping the increment back to state space through an empirical gain. We establish a theoretical framework that separates population and finite-ensemble objects and yields a bias--variance decomposition for the analysis mean. The analysis shows that stochastic EnKF variants incur an irreducible $\mathcal{O}(1/N)$ variance contribution from observation perturbations, whereas QPCA-EnDCF replaces this term with projector-estimation variability that is also $\mathcal{O}(1/N)$ but depends on the retained rank and the cutoff gap through eigenspace stability. Numerical experiments on the Lorenz--96 system in strongly undersampled regimes demonstrate that QPCA-EnDCF substantially improves spread--skill behavior, temporal tracking between spread and error, and rank-histogram reliability relative to sequential and four-dimensional stochastic EnKF. Under the baseline configuration, these calibration gains are accompanied by lower RMSE.
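
The whiten, decompose, truncate, and map-back sequence described above can be illustrated with a minimal NumPy sketch. This is not the QPCA-EnDCF algorithm itself, whose exact update the abstract does not specify. As a stand-in it uses a truncated-spectrum Kalman-type mean update, and it assumes a linear observation operator `H`, a diagonal observation-error covariance `R_diag`, and the hypothetical function name `truncated_mean_update`; all of these are illustrative choices, not taken from the paper.

```python
import numpy as np


def truncated_mean_update(X, H, R_diag, y, kappa):
    """Deterministic rank-kappa mean update in whitened observation space.

    Illustrative stand-in only (assumed, not the paper's method).
    X: (n, N) forecast ensemble, H: (m, n) linear observation operator,
    R_diag: (m,) observation-error variances, y: (m,) observation.
    """
    n, N = X.shape
    x_mean = X.mean(axis=1)
    # Scaled state anomalies, so that A @ A.T is the empirical covariance.
    A = (X - x_mean[:, None]) / np.sqrt(N - 1)

    # Predicted observations and their scaled anomalies.
    Y = H @ X
    y_mean = Y.mean(axis=1)
    B = (Y - y_mean[:, None]) / np.sqrt(N - 1)

    # Whitening by R^{-1/2} (diagonal R assumed).
    w = 1.0 / np.sqrt(R_diag)
    S = w[:, None] * B           # whitened observation anomalies
    d = w * (y - y_mean)         # whitened forecast-observation residual

    # Empirical eigendecomposition of the whitened covariance.
    C = S @ S.T
    lam, U = np.linalg.eigh(C)   # eigenvalues in ascending order
    idx = np.argsort(lam)[::-1][:kappa]
    lam_k, U_k = lam[idx], U[:, idx]

    # Restrict the correction to the leading rank-kappa subspace.
    d_reg = U_k @ ((U_k.T @ d) / (1.0 + lam_k))

    # Map back to state space through the empirical cross-covariance.
    gain = A @ S.T
    return x_mean + gain @ d_reg
```

In this sketch, setting `kappa` equal to the number of observations recovers the usual Kalman mean update built from empirical covariances, and `kappa = 0` leaves the forecast mean unchanged. No observation perturbations are drawn at any step, which is the sense in which the update is deterministic.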
