Multiset Deletion-Correcting Codes: Bounds and Constructions

We study error-correcting codes in the space $\mathcal{S}_{n,q}$ of length-$n$ multisets over a $q$-ary alphabet, motivated by permutation channels in which ordering is completely lost and errors act solely by deletions of symbols, i.e., by reducing symbol multiplicities. Our focus is on the \emph{extremal deletion regime}, where the channel output contains $k=n-t$ symbols. In this regime, we establish tight or near-tight bounds on the maximum code size. In particular, we determine the exact optimal code sizes for $t=n-1$ and for $t=n-2$, develop a refined analysis for $t=n-3$, and derive a general recursive puncturing upper bound for $t=n-k$ via a reduction from parameters $(n,k)$ to $(n-1,k-1)$. On the constructive side, we completely resolve the binary multiset model: for all $t\ge1$ we determine $S_2(n,t)$ exactly and give an explicit optimal congruence-based construction. We then study single-deletion codes beyond the binary case, presenting general $q$-ary constructions and showing, via explicit small-parameter examples, that the natural modular construction need not be optimal for $q\ge3$. Finally, we present an explicit cyclic Sidon-type linear construction for general $(q,t)$ based on a single congruence constraint, with redundancy $\log_q\!\bigl(t(t+1)^{q-2}+1\bigr)$ and encoding and decoding complexity linear in the blocklength $n$.
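The binary case and the redundancy formula above can be sanity-checked on small parameters. The sketch below is illustrative only and is not taken from the paper: it assumes a multiset is stored as a vector of symbol multiplicities, that two codewords are confusable exactly when they share a sub-multiset of size $n-t$, and that the congruence rule is "number of ones $\equiv 0 \pmod{t+1}$". The function names (`confusable`, `congruence_code`, `brute_force_max`, `redundancy`) are hypothetical.

```python
from itertools import combinations
from math import log

def binary_multisets(n):
    """All size-n multisets over {0, 1}, as (zeros, ones) multiplicity pairs."""
    return [(n - w, w) for w in range(n + 1)]

def confusable(x, y, t):
    """True if x and y share a sub-multiset of size n - t.

    Any common sub-multiset fits inside the componentwise minimum, so it
    suffices to check that this minimum holds at least n - t symbols.
    """
    n = sum(x)
    return sum(min(a, b) for a, b in zip(x, y)) >= n - t

def is_code(code, t):
    """A t-deletion-correcting code has no confusable pair of codewords."""
    return all(not confusable(x, y, t) for x, y in combinations(code, 2))

def congruence_code(n, t):
    """Assumed rule (not necessarily the paper's): ones count = 0 mod (t + 1)."""
    return [(n - w, w) for w in range(0, n + 1, t + 1)]

def brute_force_max(n, t):
    """Largest binary code size by exhaustive search; only viable for small n."""
    space = binary_multisets(n)
    for size in range(len(space), 0, -1):
        if any(is_code(c, t) for c in combinations(space, size)):
            return size
    return 0

def redundancy(q, t):
    """The abstract's expression log_q(t (t+1)^(q-2) + 1), evaluated numerically."""
    return log(t * (t + 1) ** (q - 2) + 1, q)

if __name__ == "__main__":
    n, t = 6, 2
    print(congruence_code(n, t), brute_force_max(n, t), redundancy(2, t))
```

For $n=6$, $t=2$ the assumed rule keeps the multisets with 0, 3, or 6 ones. Note that for $q=2$ the argument of the logarithm reduces to $t+1$, the modulus used in this sketch.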



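In mathematics, a multiset is a modification of the concept of a set that, unlike a set, allows multiple instances of each of its elements. The number of instances of an element, a positive integer, is called the multiplicity of that element in the multiset. As a consequence, infinitely many multisets contain only the elements a and b but differ in the multiplicities of those elements: (1) the set {a, b} contains only the elements a and b, each with multiplicity 1 when {a, b} is viewed as a multiset; (2) in the multiset {a, a, b}, the element a has multiplicity 2 and b has multiplicity 1; (3) in the multiset {a, a, a, b, b, b}, both a and b have multiplicity 3.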
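As a small illustration of multiplicities, the three examples above can be written with Python's standard `collections.Counter`, which stores each element together with its count. This is only one convenient representation, chosen here for illustration; the variable names are arbitrary.

```python
from collections import Counter

# The three example multisets over the elements a and b.
m1 = Counter("ab")       # {a, b}: both multiplicities are 1
m2 = Counter("aab")      # {a, a, b}: a has multiplicity 2, b has 1
m3 = Counter("aaabbb")   # {a, a, a, b, b, b}: both multiplicities are 3

# Deleting one copy of a from m2 lowers its multiplicity by one.
after_deletion = m2 - Counter("a")

# Multiset intersection keeps the smaller multiplicity of each element.
common = m2 & m3

if __name__ == "__main__":
    print(m2["a"], m2["b"], after_deletion == m1, sum(common.values()))
```

Order plays no role in this representation, so `Counter("aab")` and `Counter("baa")` compare equal.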
