MSR codes with linear field size and smallest sub-packetization for any number of helper nodes

An $(n,k,\ell)$ array code has $k$ information coordinates and $r = n-k$ parity coordinates, where each coordinate is a vector in $\mathbb{F}_q^{\ell}$ for some finite field $\mathbb{F}_q$. An $(n,k,\ell)$ MDS array code has the additional property that any $k$ of the $n$ coordinates suffice to recover the whole codeword. Dimakis et al. considered the problem of repairing the erasure of a single coordinate and proved a lower bound on the amount of data that must be transmitted for the repair. A minimum storage regenerating (MSR) array code with repair degree $d$ is an MDS array code that achieves this lower bound for the repair of any single erased coordinate from any $d$ of the $n-1$ remaining coordinates. An MSR code has the optimal-access property if the amount of data accessed in the repair procedure equals the amount of data transmitted. The sub-packetization $\ell$ and the field size $q$ are the two key parameters in constructions of MSR array codes. For optimal-access MSR codes, Balaji et al. proved that $\ell\geq s^{\left\lceil n/s \right\rceil}$, where $s = d-k+1$. Rawat et al. showed that this lower bound is attainable for all admissible values of $d$ when the field size is exponential in $n$. Since then, considerable effort has been devoted to reducing the field size. So far, however, a reduction to linear field size has been achieved only for $d\in\{k+1,k+2,k+3\}$ and $d=n-1$. In this paper, we construct optimal-access MSR codes with linear field size and smallest sub-packetization $\ell = s^{\left\lceil n/s \right\rceil}$ for all $d$ between $k+1$ and $n-1$. We also construct another class of MSR codes that are not optimal-access but have even smaller sub-packetization $s^{\left\lceil n/(s+1)\right\rceil }$. The second class also has linear field size and works for all admissible values of $d$.
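To make the parameter formulas above concrete, the following minimal sketch evaluates $s = d-k+1$ and the two sub-packetization values $s^{\lceil n/s\rceil}$ and $s^{\lceil n/(s+1)\rceil}$ for given $(n,k,d)$. It also evaluates the repair bandwidth $d\ell/(d-k+1)$, which is the standard form of the lower bound of Dimakis et al. for MSR codes (the abstract refers to the bound without stating it). The function names and the example parameters $(n,k,d)=(14,10,12)$ are illustrative assumptions, not taken from the paper, and the code only computes parameters; it does not construct any code.

```python
def ceil_div(a: int, b: int) -> int:
    """Integer ceiling of a / b, avoiding floating-point rounding."""
    return -(-a // b)


def msr_parameters(n: int, k: int, d: int) -> dict:
    """Evaluate the parameter formulas quoted in the abstract.

    Assumes k + 1 <= d <= n - 1, the range of repair degrees the abstract covers.
    """
    if not (k + 1 <= d <= n - 1):
        raise ValueError("repair degree d must satisfy k + 1 <= d <= n - 1")
    s = d - k + 1
    # Smallest sub-packetization for optimal-access MSR codes.
    ell_optimal_access = s ** ceil_div(n, s)
    # Smaller sub-packetization of the second (not optimal-access) class.
    ell_smaller = s ** ceil_div(n, s + 1)
    return {"s": s, "ell_optimal_access": ell_optimal_access, "ell_smaller": ell_smaller}


def repair_bandwidth(d: int, k: int, ell: int) -> int:
    """Symbols transmitted at the MSR point: each of d helpers sends ell / (d - k + 1)."""
    s = d - k + 1
    if ell % s != 0:
        raise ValueError("ell must be divisible by s = d - k + 1")
    return d * (ell // s)


if __name__ == "__main__":
    n, k, d = 14, 10, 12  # hypothetical example parameters
    params = msr_parameters(n, k, d)
    ell = params["ell_optimal_access"]
    print(params)
    print("MSR repair bandwidth:", repair_bandwidth(d, k, ell))
    print("naive repair (download k full coordinates):", k * ell)
```

For this example, $s = 3$, so the optimal-access sub-packetization is $3^5 = 243$ and the second class needs only $3^4 = 81$. With $\ell = 243$, repair at the MSR point transmits $12 \cdot 81 = 972$ symbols, compared with $10 \cdot 243 = 2430$ for naive repair from $k$ full coordinates.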
