Linear Insertion Deletion Codes in the High-Noise and High-Rate Regimes

This work continues the study of linear error-correcting codes against adversarial insertion and deletion errors (insdel errors). Previously, Cheng, Guruswami, Haeupler, and Li \cite{CGHL21} showed the existence of asymptotically good linear insdel codes that can correct a fraction of errors arbitrarily close to $1$ over some constant-size alphabet, or achieve rate arbitrarily close to $1/2$ even over the binary alphabet. As shown in \cite{CGHL21}, these bounds are also the best possible. However, the known explicit constructions in \cite{CGHL21}, and the subsequent improved constructions by Con, Shpilka, and Tamo \cite{9770830}, all fall short of meeting these bounds. Over any constant-size alphabet, they can only achieve rate $< 1/8$ or correct a $< 1/4$ fraction of errors; over the binary alphabet, they can only achieve rate $< 1/1216$ or correct a $< 1/54$ fraction of errors. Apparently, previous techniques face inherent barriers to achieving rate better than $1/4$ or correcting more than a $1/2$ fraction of errors. In this work we give new constructions of such codes that meet these bounds, namely, asymptotically good linear insdel codes that can correct a fraction of errors arbitrarily close to $1$ over some constant-size alphabet, and binary asymptotically good linear insdel codes that can achieve rate arbitrarily close to $1/2$. All our constructions are efficiently encodable and decodable. Our constructions are based on a novel approach to code concatenation, which embeds the index information implicitly into codewords. This differs significantly from previous techniques and may be of independent interest. Finally, we also prove the existence of linear concatenated insdel codes with parameters that match random linear codes, and propose a conjecture about linear insdel codes.
