Improving the Leading Constant of Matrix Multiplication

Algebraic matrix multiplication algorithms are designed by bounding the rank of matrix multiplication tensors, and then using a recursive method. However, designing algorithms in this way quickly leads to large constant factors: if one proves that the tensor for multiplying $n \times n$ matrices has rank $\leq t$, then the resulting recurrence shows that $M \times M$ matrices can be multiplied using $O(n^2 \cdot M^{\log_n t})$ operations, where the leading constant scales proportionally to $n^2$. Even modest increases in $n$ can blow up the leading constant so much that the slight decrease in the exponent of $M$ is not worth it. Meanwhile, the asymptotically best algorithms use very large $n$, such that $n^2$ is larger than the number of atoms in the visible universe! In this paper, we give new ways to use tensor rank bounds to design matrix multiplication algorithms, which lead to smaller leading constants than the standard recursive method. Our main result shows that, if the tensor for multiplying $n \times n$ matrices has rank $\leq t$, then $M \times M$ matrices can be multiplied using only $n^{O(1/(\log n)^{0.33})} \cdot M^{\log_n t}$ operations. In other words, we improve the leading constant in general from $O(n^2)$ to $n^{O(1/(\log n)^{0.33})} < n^{o(1)}$. We then apply this result and further improve the leading constant in a number of situations of interest. We show that, in the popularly conjectured case where $\omega=2$, a new, different recursive approach can lead to an improvement. We also show that the leading constant of the current asymptotically fastest matrix multiplication algorithm, and of any algorithm designed using the group-theoretic method, can be further improved by taking advantage of additional structure of the underlying tensor identities.
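To make the leading constant of the standard recursive method concrete, the following is a minimal sketch that counts arithmetic operations for the textbook recurrence $T(M) = t \cdot T(M/n) + a \cdot (M/n)^2$ with $T(1) = 1$. It illustrates only the baseline recursion, not the new methods of this paper. The names `op_count` and `leading_constant`, and the parameter `adds` (the number $a$ of block additions performed per recursion step), are assumptions introduced here for illustration. The example values $n = 2$, $t = 7$, $a = 18$ correspond to Strassen's original algorithm.

```python
import math


def op_count(M, n, t, adds):
    """Count arithmetic operations of the standard recursion on M x M inputs.

    Assumes M is a power of n. Each level performs t recursive products of
    (M/n) x (M/n) blocks plus `adds` additions of such blocks, and a 1 x 1
    product costs one operation.
    """
    if M == 1:
        return 1
    block = M // n
    return t * op_count(block, n, t, adds) + adds * block * block


def leading_constant(M, n, t, adds):
    """Ratio of the operation count to M^(log_n t) for a given M."""
    return op_count(M, n, t, adds) / M ** math.log(t, n)


if __name__ == "__main__":
    # Strassen's original algorithm: n = 2, t = 7, 18 block additions per step.
    for k in (1, 5, 10, 15):
        M = 2 ** k
        print(M, op_count(M, 2, 7, 18), leading_constant(M, 2, 7, 18))
```

Solving this recurrence gives $T(M) = \left(1 + \frac{a}{t - n^2}\right) M^{\log_n t} - \frac{a}{t - n^2} M^2$ whenever $t > n^2$, so for Strassen's parameters the ratio printed above approaches $7$ from below. In this model the leading constant is governed by $a$, the number of block additions per step.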
