In this paper, we consider a general notion of convolution. Let $D$ be a finite domain and let $D^n$ be the set of length-$n$ vectors (tuples) over $D$. Let $f : D \times D \to D$ be a function and let $\oplus_f$ denote the coordinate-wise application of $f$. The $f$-Convolution of two functions $g,h : D^n \to \{-M,\ldots,M\}$ is $$(g \otimes_f h)(\textbf{v}) := \sum_{\substack{\textbf{v}_g,\textbf{v}_h \in D^n\\ \text{s.t. } \textbf{v} = \textbf{v}_g \oplus_f \textbf{v}_h}} g(\textbf{v}_g) \cdot h(\textbf{v}_h)$$ for every $\textbf{v} \in D^n$. This problem generalizes many fundamental convolutions, such as Subset Convolution, XOR Product, Covering Product, and Packing Product. For an arbitrary function $f$ and domain $D$, we can compute the $f$-Convolution via brute-force enumeration in $\widetilde{O}(|D|^{2n}\mathrm{polylog}(M))$ time. Our main result is an improvement over this naive algorithm. We show that $f$-Convolution can be computed exactly in $\widetilde{O}((c \cdot |D|^2)^{n}\mathrm{polylog}(M))$ time for constant $c := 3/4$ when $D$ has even cardinality. Our main observation is that a \emph{cyclic partition} of a function $f : D \times D \to D$ can be used to speed up the computation of $f$-Convolution, and we show that an appropriate cyclic partition exists for every $f$. Furthermore, we demonstrate that a single entry of the $f$-Convolution can be computed more efficiently. In this variant, called the $f$-Query problem, we are given two functions $g,h : D^n \to \{-M,\ldots,M\}$ along with a vector $\textbf{v} \in D^n$, and the task is to compute the integer $(g \otimes_f h)(\textbf{v})$. This is a generalization of the well-known Orthogonal Vectors problem. We show that $f$-Query can be computed in $\widetilde{O}(|D|^{\frac{\omega}{2} n}\mathrm{polylog}(M))$ time, where $\omega \in [2,2.372)$ is the exponent of the currently fastest matrix multiplication algorithm.
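To make the definition concrete, the following is a minimal sketch of the brute-force baseline only, not the improved algorithm based on cyclic partitions. It enumerates all $|D|^{2n}$ pairs $(\textbf{v}_g, \textbf{v}_h)$ and accumulates each product at the entry $\textbf{v}_g \oplus_f \textbf{v}_h$. The names `f_convolution`, `ones`, `xor_conv`, and `or_conv`, and the choice of representing $g$ and $h$ as dictionaries keyed by tuples, are illustrative assumptions and do not come from the paper.

```python
from itertools import product


def f_convolution(f, domain, n, g, h):
    """Brute-force f-convolution of g and h over domain^n.

    f      : function D x D -> D, applied coordinate-wise
    domain : finite iterable of elements of D
    g, h   : dicts mapping every n-tuple over D to an integer
    Returns a dict mapping every n-tuple v to (g (x)_f h)(v).
    """
    vectors = list(product(domain, repeat=n))
    result = {v: 0 for v in vectors}
    # Enumerate all |D|^(2n) pairs (v_g, v_h).
    for vg in vectors:
        for vh in vectors:
            # Coordinate-wise application of f gives the target entry v.
            v = tuple(f(a, b) for a, b in zip(vg, vh))
            result[v] += g[vg] * h[vh]
    return result


# Illustrative inputs (assumed for this sketch): D = {0, 1}, n = 2,
# and g = h = the all-ones function.
D = (0, 1)
n = 2
ones = {v: 1 for v in product(D, repeat=n)}

# f = XOR: every target vector is reached by exactly |D|^n = 4 pairs.
xor_conv = f_convolution(lambda a, b: a ^ b, D, n, ones, ones)

# f = OR: per coordinate, output 0 has one preimage pair and
# output 1 has three, so the counts multiply across coordinates.
or_conv = f_convolution(lambda a, b: a | b, D, n, ones, ones)
```

With all-ones inputs, each entry simply counts the pairs mapped to it, so the values over all entries sum to $|D|^{2n} = 16$ in both cases. A single entry, as in the $f$-Query problem, can be read from the returned dictionary, although this costs the full brute-force time rather than the faster bound stated above.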