We study error-correcting codes in the space $\mathcal{S}_{n,q}$ of length-$n$ multisets over a $q$-ary alphabet, motivated by permutation channels in which ordering is completely lost and errors act solely by deletions of symbols, i.e., by reducing symbol multiplicities. Our focus is on the \emph{extremal deletion regime}, where the channel output contains $k=n-t$ symbols. In this regime, we establish tight or near-tight bounds on the maximum code size. In particular, we determine the exact optimal code sizes for $t=n-1$ and for $t=n-2$, develop a refined analysis for $t=n-3$, and derive a general recursive puncturing upper bound for $t=n-k$ via a reduction from parameters $(n,k)$ to $(n-1,k-1)$. On the constructive side, we completely resolve the binary multiset model: for all $t\ge1$ we determine $S_2(n,t)$ exactly and give an explicit optimal congruence-based construction. We then study single-deletion codes beyond the binary case, presenting general $q$-ary constructions and showing, via explicit small-parameter examples, that the natural modular construction need not be optimal for $q\ge3$. Finally, we present an explicit cyclic Sidon-type linear construction for general $(q,t)$ based on a single congruence constraint, with redundancy $\log_q\!\bigl(t(t+1)^{q-2}+1\bigr)$ and encoding and decoding complexity linear in the blocklength $n$.
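As a small illustration of the binary case, the following Python sketch brute-forces the largest $t$-deletion-correcting code over binary multisets for small $n$ and compares it with a congruence-based code. It rests on assumptions not taken from the text above: a binary multiset of length $n$ is represented by its number of zeros $a\in\{0,\dots,n\}$; two codewords are treated as confusable when they share a common sub-multiset of size $n-t$; and the congruence code keeps the counts with $a\equiv 0 \pmod{t+1}$. The function names and the choice of residue class are hypothetical and need not coincide with the construction used in the paper.

```python
from itertools import combinations


def confusable(a1, a2, n, t):
    """Assumed criterion: binary multisets with a1 and a2 zeros (n symbols each)
    can yield the same output after t deletions iff they share a common
    sub-multiset of size n - t."""
    common = min(a1, a2) + min(n - a1, n - a2)  # size of the largest common sub-multiset
    return common >= n - t


def is_code(code, n, t):
    """True if no two distinct codewords are confusable under t deletions."""
    return all(not confusable(x, y, n, t) for x, y in combinations(code, 2))


def congruence_code(n, t, residue=0):
    """Hypothetical congruence construction: keep zero-counts a with
    a congruent to residue modulo t + 1."""
    return [a for a in range(n + 1) if a % (t + 1) == residue]


def max_code_size_bruteforce(n, t):
    """Exhaustive search over all subsets of {0, ..., n}; only feasible for small n."""
    best = 0
    for mask in range(1 << (n + 1)):
        code = [a for a in range(n + 1) if (mask >> a) & 1]
        if len(code) > best and is_code(code, n, t):
            best = len(code)
    return best


if __name__ == "__main__":
    for n in range(1, 9):
        for t in range(1, n + 1):
            code = congruence_code(n, t)
            print(n, t, len(code), max_code_size_bruteforce(n, t))
```

Under these assumptions the two sizes agree for every pair $(n,t)$ in the tested range. The exhaustive search is exponential in $n$ and serves only as a sanity check for small parameters.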