We consider the simultaneously fast and in-place computation of the Euclidean polynomial modular remainder $R(X) \equiv A(X) \mod B(X)$, with $A$ and $B$ of respective degrees $n$ and $m \le n$. Fast algorithms for this task usually come at the expense of (potentially large) extra temporary space. To remain in-place, a further issue is to avoid storing the whole quotient $Q(X)$ such that $A=BQ+R$. Suppose that the multiplication of two polynomials of degree $k$ can be performed with $M(k)$ operations and $O(k)$ extra space, and that the input space of $A$ or $B$ may be used for intermediate computations, provided that $A$ and $B$ are restored to their initial states once the remainder has been computed. Under these assumptions, we propose an in-place algorithm (that is, one whose extra required space is reduced to $O(1)$) using at most $O(\frac{n}{m} M(m)\log(m))$ arithmetic operations if $M(m)$ is quasi-linear, or $O(\frac{n}{m} M(m))$ otherwise. We also propose variants that compute, still in-place and with the same kind of complexity bounds, the over-place remainder $A(X) \equiv A(X) \mod B(X)$, the accumulated remainder $R(X) += A(X) \mod B(X)$ and the accumulated modular multiplication $R(X) += A(X)C(X) \mod B(X)$. To achieve this, we develop techniques for Toeplitz matrix operations whose output is also part of the input. Fast and in-place accumulating versions are obtained for the latter, and thus for convolutions, and are then used for polynomial remaindering. This is realized via further reductions to accumulated polynomial multiplication, for which fast in-place algorithms have recently been developed.
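To make the over-place remainder concrete, here is a minimal Python sketch of the classical (quadratic-time) long division performed directly inside the storage of $A$. It is not the fast algorithm proposed in this work: it only illustrates how each quotient coefficient can be used and discarded immediately, so that the quotient $Q$ is never stored and only $O(1)$ extra values are needed. The function name `rem_overplace`, the low-degree-first coefficient layout and the use of exact rationals are assumptions made for this illustration.

```python
from fractions import Fraction


def rem_overplace(a, b):
    """Overwrite `a` with A mod B (classical long division, O(1) extra values).

    Hypothetical helper for illustration only.
    a, b: coefficient lists, lowest degree first; entries of `a` are Fractions,
    len(a) >= len(b), and the leading coefficient b[-1] is nonzero.
    On return, a[:m] holds the remainder (m = deg B) and a[m:] is all zeros.
    """
    m = len(b) - 1
    lead = b[-1]
    # Eliminate the coefficients of A from the top degree down to degree m.
    for i in range(len(a) - 1, m - 1, -1):
        q = a[i] / lead  # one quotient coefficient, never stored
        for j in range(m + 1):
            a[i - m + j] -= q * b[j]  # subtract q * X^(i-m) * B from A
    return m


# A = X^3 + 2X + 5 and B = X^2 + 1, so A = X * B + (X + 5).
a = [Fraction(c) for c in (5, 2, 0, 1)]
m = rem_overplace(a, [1, 0, 1])
# a[:m] now holds the remainder X + 5, i.e. coefficients 5, 1.
```

This sketch costs $O(m(n-m+1))$ coefficient operations. The algorithms described above keep the same constant extra space while replacing the inner loop with fast accumulated polynomial multiplications.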