Block classical Gram-Schmidt (BCGS) is commonly used for orthogonalizing a set of vectors $X$ in distributed computing environments due to its favorable communication properties relative to other orthogonalization approaches, such as modified Gram-Schmidt or Householder. However, it is known that BCGS (as well as recently developed low-synchronization variants of BCGS) can suffer from a significant loss of orthogonality in finite-precision arithmetic, which can contribute to instability and inaccurate solutions in downstream applications such as $s$-step Krylov subspace methods. A common solution to improve the orthogonality among the vectors is reorthogonalization. Focusing on the "Pythagorean" variant of BCGS, introduced in [E. Carson, K. Lund, & M. Rozlo\v{z}n\'{i}k. SIAM J. Matrix Anal. Appl. 42(3), pp. 1365--1380, 2021], which guarantees an $O(\varepsilon)\kappa^2(X)$ bound on the loss of orthogonality as long as $O(\varepsilon)\kappa^2(X)<1$, where $\varepsilon$ denotes the unit roundoff, we introduce and analyze two reorthogonalized Pythagorean BCGS variants. These variants feature favorable communication properties, with asymptotically two synchronization points per block column, as well as an improved $O(\varepsilon)$ bound on the loss of orthogonality. Our bounds are derived in a general fashion to additionally allow for the analysis of mixed-precision variants. We verify our theoretical results with a panel of test matrices and experiments from a new version of the \texttt{BlockStab} toolbox.
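To make the idea concrete, the following is a minimal NumPy sketch, not the authors' implementation and not code from the \texttt{BlockStab} toolbox. It assumes the usual formulation of Pythagorean BCGS, in which the diagonal block $R_{kk}$ is obtained from a Cholesky factorization of $X_k^T X_k - R_{1:k-1,k}^T R_{1:k-1,k}$, and it assumes the simplest form of reorthogonalization, namely running the whole procedure a second time on the computed basis. The function names (`bcgs_pip`, `bcgs_pip_twice`, `make_test_matrix`), the block size `s`, and the test matrix are all hypothetical choices made for illustration; the two variants analyzed in the paper may organize the second pass differently.

```python
import numpy as np


def bcgs_pip(X, s):
    """One pass of block classical Gram-Schmidt with a Pythagorean
    (Cholesky-based) diagonal block. Illustrative sketch only."""
    m, n = X.shape
    assert n % s == 0, "number of columns must be a multiple of the block size"
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for k in range(0, n, s):
        Xk = X[:, k:k + s]
        # One fused block inner product: in a distributed setting this would
        # be the single synchronization point of the pass for this block.
        W = np.hstack([Q[:, :k], Xk]).T @ Xk
        Rk = W[:k, :]        # projections onto the previous basis vectors
        Omega = W[k:, :]     # Gram matrix X_k^T X_k
        # Pythagorean step: R_kk^T R_kk = Omega - Rk^T Rk.
        # np.linalg.cholesky returns the lower factor, so transpose it.
        Rkk = np.linalg.cholesky(Omega - Rk.T @ Rk).T
        V = Xk - Q[:, :k] @ Rk
        # Q_k = V R_kk^{-1}, computed via a solve with R_kk^T.
        Q[:, k:k + s] = np.linalg.solve(Rkk.T, V.T).T
        R[:k, k:k + s] = Rk
        R[k:k + s, k:k + s] = Rkk
    return Q, R


def bcgs_pip_twice(X, s):
    """Assumed reorthogonalization strategy: apply bcgs_pip to X, then again
    to the computed basis, and combine the two triangular factors."""
    Q1, R1 = bcgs_pip(X, s)
    Q, R2 = bcgs_pip(Q1, s)
    return Q, R2 @ R1


def make_test_matrix(m, n, kappa, seed=0):
    """Hypothetical test matrix with 2-norm condition number about kappa."""
    rng = np.random.default_rng(seed)
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    sigma = np.logspace(0, -np.log10(kappa), n)
    return (U * sigma) @ V.T


def loss_of_orthogonality(Q):
    """Spectral norm of I - Q^T Q."""
    return np.linalg.norm(np.eye(Q.shape[1]) - Q.T @ Q, 2)


if __name__ == "__main__":
    X = make_test_matrix(200, 20, 1e4)
    Q1, _ = bcgs_pip(X, 4)
    Q2, R = bcgs_pip_twice(X, 4)
    print("one pass   :", loss_of_orthogonality(Q1))
    print("two passes :", loss_of_orthogonality(Q2))
    print("residual   :", np.linalg.norm(X - Q2 @ R) / np.linalg.norm(X))
```

The condition number of the test matrix is chosen so that $\varepsilon\kappa^2(X)$ stays well below 1 in double precision; for more ill-conditioned inputs the Cholesky factorization in the first pass can fail, which is consistent with the condition $O(\varepsilon)\kappa^2(X)<1$ stated above. Each pass uses one fused inner product per block column, so this sketch has two such products per block column in total.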