We develop lower bounds on communication in the memory hierarchy or between processors for nested bilinear algorithms, such as Strassen's algorithm for matrix multiplication. We build on a previous framework that establishes communication lower bounds using the rank expansion (the minimum rank over all fixed-size subsets of a matrix's columns) of each of the three matrices encoding a bilinear algorithm. This framework provides lower bounds for a class of dependency directed acyclic graphs (DAGs) corresponding to the execution of a given bilinear algorithm, in contrast to other approaches that yield bounds only for specific DAGs. However, our lower bounds apply only to executions that do not compute the same DAG node multiple times. Two bilinear algorithms can be nested by taking Kronecker products of their encoding matrices. Our main result is a lower bound on the rank expansion of a Kronecker product of matrices, derived from lower bounds on the rank expansion of the product's operands. We apply these rank expansion lower bounds to obtain novel communication lower bounds for nested Toom-Cook convolution, Strassen's algorithm, and fast algorithms for contraction of partially symmetric tensors.
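As an illustration of the two central notions above, the following minimal sketch nests a bilinear algorithm with itself via Kronecker products and computes rank expansion by brute force. It assumes the convention that an algorithm (A, B, C) computes c = C((Aᵀa) ∘ (Bᵀb)), uses Karatsuba's three-multiplication scheme (the smallest Toom-Cook instance) as the base algorithm, and takes "rank expansion at size k" to mean the minimum rank over all k-column subsets. The helper names `apply_bilinear`, `nest`, `rank_expansion`, and `conv2d_reference` are hypothetical and chosen only for this example; the exhaustive search is exponential in k and is meant for tiny matrices only.

```python
from itertools import combinations

import numpy as np

# Karatsuba as a bilinear algorithm (A, B, C) for the linear convolution of
# two length-2 vectors, under the assumed convention c = C @ ((A.T @ a) * (B.T @ b)).
A = np.array([[1, 1, 0],
              [0, 1, 1]])
B = A.copy()
C = np.array([[1, 0, 0],
              [-1, 1, -1],
              [0, 0, 1]])


def apply_bilinear(A, B, C, a, b):
    """Evaluate the bilinear algorithm encoded by (A, B, C) on inputs a and b."""
    return C @ ((A.T @ a) * (B.T @ b))


def nest(alg1, alg2):
    """Nest two bilinear algorithms by taking Kronecker products of their encoding matrices."""
    return tuple(np.kron(M1, M2) for M1, M2 in zip(alg1, alg2))


def rank_expansion(M, k):
    """Brute-force minimum rank over all subsets of k columns of M (tiny inputs only)."""
    return min(
        int(np.linalg.matrix_rank(M[:, list(cols)]))
        for cols in combinations(range(M.shape[1]), k)
    )


def conv2d_reference(X, Y):
    """Direct 2D linear convolution, used only to check the nested algorithm."""
    Z = np.zeros((X.shape[0] + Y.shape[0] - 1, X.shape[1] + Y.shape[1] - 1))
    for i in range(X.shape[0]):
        for j in range(X.shape[1]):
            Z[i:i + Y.shape[0], j:j + Y.shape[1]] += X[i, j] * Y
    return Z


# Nesting Karatsuba with itself gives an algorithm for 2D convolution of
# 2x2 inputs (flattened row-major) that uses 9 products instead of 16.
A2, B2, C2 = nest((A, B, C), (A, B, C))

rng = np.random.default_rng(0)
X = rng.integers(-5, 6, size=(2, 2))
Y = rng.integers(-5, 6, size=(2, 2))
Z = apply_bilinear(A2, B2, C2, X.ravel(), Y.ravel()).reshape(3, 3)

print("nested result matches direct 2D convolution:",
      np.allclose(Z, conv2d_reference(X, Y)))
print("rank expansion of A at k = 1, 2, 3:",
      [rank_expansion(A, k) for k in (1, 2, 3)])
print("rank expansion of kron(A, A) at k = 2:", rank_expansion(A2, 2))
```

The brute-force search makes the definition concrete but does not scale; bounding the rank expansion of a Kronecker product from the rank expansions of its operands, as in the main result stated above, avoids enumerating column subsets of the larger matrix.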