We present a numerically stable parallel-in-time linear Kalman smoother. The smoother uses a novel, highly parallel QR factorization for a class of structured sparse matrices to estimate the states, and an adaptation of the SelInv selective-inversion algorithm to evaluate the covariance matrices of the estimated states. Our implementation of the new algorithm, using the Threading Building Blocks (TBB) library, scales well on both Intel and ARM multi-core servers, achieving speedups of up to 47x on 64 cores. The algorithm performs more arithmetic than sequential smoothers; consequently, it is 1.8x to 2.5x slower than they are on a single core. The new algorithm is faster and scales better than the parallel Kalman smoother proposed by Särkkä and García-Fernández in 2021.
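To put the reported figures in perspective, the short sketch below works through the arithmetic. It is only an illustration: the abstract does not state which baseline the 47x speedup is measured against, so the second function assumes it is relative to the parallel algorithm's own single-core run, and it further assumes that the speedup and slowdown figures can be combined even though they may come from different machines or problem sizes. The function names are hypothetical.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear scaling achieved on `cores` cores."""
    return speedup / cores


def speedup_vs_sequential(speedup: float, single_core_slowdown: float) -> float:
    """Net speedup over a sequential smoother.

    ASSUMPTION (not stated in the abstract): `speedup` is measured against
    the parallel algorithm running on one core, which is itself
    `single_core_slowdown` times slower than a sequential smoother.
    """
    return speedup / single_core_slowdown


if __name__ == "__main__":
    # Figures quoted in the abstract: up to 47x on 64 cores,
    # and a 1.8x to 2.5x single-core slowdown.
    eff = parallel_efficiency(47.0, 64)
    low = speedup_vs_sequential(47.0, 2.5)
    high = speedup_vs_sequential(47.0, 1.8)
    print(f"parallel efficiency: {eff:.2f}")
    print(f"net speedup vs. sequential (under the assumption): {low:.1f}x to {high:.1f}x")
```

Under these assumptions the best case corresponds to roughly 73% parallel efficiency. If the 47x figure is instead measured against a sequential smoother, the second calculation does not apply.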