We consider the simultaneously fast and in-place computation of the Euclidean polynomial modular remainder $R(X) \equiv A(X) \mod B(X)$, with $A$ and $B$ of respective degrees $n$ and $m \le n$. Fast algorithms for this usually come at the expense of (potentially large) extra temporary space. To remain in-place, a further issue is to avoid storing the whole quotient $Q(X)$ such that $A=BQ+R$. If the multiplication of two polynomials of degree $k$ can be performed with $M(k)$ operations and $O(k)$ extra space, and if the input space of $A$ or $B$ may be used for intermediate computations, provided that $A$ and $B$ are restored to their initial states once the remainder computation completes, we propose an in-place algorithm (that is, one whose extra required space is reduced to $O(1)$) using at most $O(\frac{n}{m} M(m)\log(m))$ arithmetic operations if $M(m)$ is quasi-linear, or $O(\frac{n}{m} M(m))$ otherwise. We also propose variants that compute, still in-place and with the same kind of complexity bounds, the over-place remainder $A(X) \equiv A(X) \mod B(X)$, the accumulated remainder $R(X) += A(X) \mod B(X)$ and the accumulated modular multiplication $R(X) += A(X)C(X) \mod B(X)$. To achieve this, we develop techniques for Toeplitz matrix operations whose output is also part of the input. Fast and in-place accumulating versions are obtained for the latter, and thus for convolutions, and are then used for polynomial remaindering. This is realized via further reductions to accumulated polynomial multiplication, for which fast in-place algorithms have recently been developed.
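To make the notion of an over-place remainder concrete, the following minimal sketch overwrites $A$ with $A \mod B$ using only $O(1)$ extra scalars and never stores the quotient $Q$. It is the classical quadratic long division, not the fast algorithm proposed here, and it is shown only as a baseline. The function name `remainder_overplace`, the choice of coefficients in a prime field $\mathbb{Z}/p\mathbb{Z}$, and the low-to-high coefficient ordering are assumptions made for illustration.

```python
def remainder_overplace(a, b, p):
    """Overwrite a with (a mod b) over Z/pZ, for a prime p.

    a and b are coefficient lists, lowest degree first, of degrees
    n = len(a) - 1 and m = len(b) - 1 with m <= n. The leading
    coefficient of b must be nonzero modulo p.
    Illustrative baseline only: O(n * m) operations, O(1) extra scalars.
    """
    n, m = len(a) - 1, len(b) - 1
    inv_lead = pow(b[m], -1, p)  # modular inverse (Python 3.8+)
    for i in range(n, m - 1, -1):
        # Current quotient coefficient: used immediately, then discarded,
        # so the full quotient Q is never stored.
        q = (a[i] * inv_lead) % p
        for j in range(m + 1):
            a[i - m + j] = (a[i - m + j] - q * b[j]) % p
    # The remainder now sits in a[0..m-1]; a[m..n] have been zeroed.


# A(X) = X^4 + 3X^2 + 2X + 5 and B(X) = X^2 + X + 1 over Z/7Z.
a = [5, 2, 3, 0, 1]
remainder_overplace(a, [1, 1, 1], 7)
remainder = a[:2]
```

Unlike the setting described above, this baseline destroys the high-order coefficients of $A$ and leaves $B$ untouched; it only illustrates that the quotient need not be kept in memory.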