Block classical Gram-Schmidt (BCGS) is commonly used for orthogonalizing a set of vectors $X$ in distributed computing environments due to its favorable communication properties relative to other orthogonalization approaches, such as modified Gram-Schmidt or Householder. However, it is known that BCGS (as well as recently developed low-synchronization variants of BCGS) can suffer from a significant loss of orthogonality in finite-precision arithmetic, which can contribute to instability and inaccurate solutions in downstream applications such as $s$-step Krylov subspace methods. A common solution to improve the orthogonality among the vectors is reorthogonalization. Focusing on the "Pythagorean" variant of BCGS, introduced in [E. Carson, K. Lund, & M. Rozlo\v{z}n\'{i}k. SIAM J. Matrix Anal. Appl. 42(3), pp. 1365--1380, 2021], which guarantees an $O(\varepsilon)\kappa^2(X)$ bound on the loss of orthogonality as long as $O(\varepsilon)\kappa^2(X)<1$, where $\varepsilon$ denotes the unit roundoff, we introduce and analyze two reorthogonalized Pythagorean BCGS variants. These variants feature favorable communication properties, with asymptotically two synchronization points per block column, as well as an improved $O(\varepsilon)$ bound on the loss of orthogonality. Our bounds are derived in a general fashion to additionally allow for the analysis of mixed-precision variants. We verify our theoretical results with a panel of test matrices and experiments from a new version of the \texttt{BlockStab} toolbox.
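To make the role of reorthogonalization concrete, the following NumPy sketch shows a Pythagorean-style BCGS pass that forms one fused block inner product per block column and recovers the diagonal block of $R$ from a Cholesky factorization, followed by a naive second pass over the computed $Q$. It is an illustration under stated assumptions, not the reorthogonalized variants analyzed in this work and not code from \texttt{BlockStab}: the function names `bcgs_pip`, `bcgs_pip_twice`, and `loss_of_orthogonality`, the block size `s`, and the test matrix are all hypothetical choices made for this example.

```python
import numpy as np


def bcgs_pip(X, s):
    """One pass of a Pythagorean-style BCGS sketch with block size s (assumed)."""
    m, n = X.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for start in range(0, n, s):
        stop = min(start + s, n)
        Xk = X[:, start:stop]
        Qprev = Q[:, :start]
        # One fused inner product [Qprev, Xk]^T Xk; in a distributed setting
        # this would correspond to a single synchronization point.
        G = np.hstack([Qprev, Xk]).T @ Xk
        S = G[:start, :]        # Qprev^T Xk
        Omega = G[start:, :]    # Xk^T Xk
        # Block Pythagorean identity: Rkk^T Rkk = Omega - S^T S.
        L = np.linalg.cholesky(Omega - S.T @ S)
        Rkk = L.T
        W = Xk - Qprev @ S
        # Q_k = W Rkk^{-1}, computed by solving Rkk^T Y^T = W^T.
        Q[:, start:stop] = np.linalg.solve(Rkk.T, W.T).T
        R[:start, start:stop] = S
        R[start:stop, start:stop] = Rkk
    return Q, R


def bcgs_pip_twice(X, s):
    """Naive reorthogonalization (an assumption of this sketch): run the pass twice."""
    Q1, R1 = bcgs_pip(X, s)
    Q, R2 = bcgs_pip(Q1, s)
    return Q, R2 @ R1


def loss_of_orthogonality(Q):
    """Spectral norm of I - Q^T Q."""
    return np.linalg.norm(np.eye(Q.shape[1]) - Q.T @ Q, 2)


if __name__ == "__main__":
    # Hypothetical test matrix with condition number about 1e4.
    rng = np.random.default_rng(0)
    m, n, s = 200, 20, 4
    U, _ = np.linalg.qr(rng.standard_normal((m, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    X = U @ np.diag(np.logspace(0, -4, n)) @ V.T

    Q1, _ = bcgs_pip(X, s)
    Q2, R = bcgs_pip_twice(X, s)
    print("single pass:", loss_of_orthogonality(Q1))
    print("two passes: ", loss_of_orthogonality(Q2))
```

Running the whole procedure twice doubles the number of synchronization points per block column, so this sketch only illustrates why a second orthogonalization step helps when $O(\varepsilon)\kappa^2(X)<1$; it does not reproduce the communication savings claimed for the variants introduced here.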