We study error-correcting codes in the space $\mathcal{S}_{n,q}$ of length-$n$ multisets over a $q$-ary alphabet under the deletion metric, motivated by permutation channels in which ordering is completely lost and errors act only on symbol multiplicities. We develop two complementary directions. First, we present polynomial Sidon-type constructions over finite fields, in both projective and affine forms, yielding multiset $t$-deletion-correcting codes in the regime $t<q$ with redundancy $t+O(1)$, independent of the blocklength $n$. Second, we develop a geometric analysis of deletion balls in $\mathcal{S}_{n,q}$. Using difference-vector representations together with a diagonal reduction of the relevant generating functions, we derive exact generating-function expressions for individual deletion-ball sizes, exact formulas for the number of ordered pairs of multisets at a fixed distance $m$, and consequently for the average ball size. We prove that radius-$r$ deletion balls are minimized at extreme multisets and maximized at the most balanced multisets, giving a formal global characterization of extremal centers in $\mathcal{S}_{n,q}$. We further relate the maximal-ball value to the ideal difference set $S_{q-1}(r,r)$ through boundary truncation, obtaining explicit closed forms for $q=2$ and $q=3$. These geometric results lead to volume-based bounds on code size, including sphere-packing upper bounds, a boundary-aware analysis of code--anticode arguments, and Gilbert--Varshamov-type lower bounds governed by exact average ball sizes. For fixed $q$ and $t$, the resulting average-ball lower bound matches the interior-difference-set scale asymptotically.
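The extremal claim about deletion balls can be checked by brute force for small parameters. The sketch below is only an illustration, not the paper's method. It assumes that a multiset in $\mathcal{S}_{n,q}$ is represented by its multiplicity vector $(x_0,\dots,x_{q-1})$ with $\sum_i x_i = n$, and that the deletion distance between two such multisets is $d(x,y)=\sum_i \max(x_i-y_i,0)$, i.e. the number of symbols that must be deleted from $x$ to reach the largest common sub-multiset. The function names `multisets`, `deletion_distance`, and `ball_sizes` are hypothetical and chosen for this example.

```python
from collections import Counter
from itertools import combinations_with_replacement


def multisets(n, q):
    """Enumerate S_{n,q} as multiplicity vectors (x_0, ..., x_{q-1}) summing to n."""
    space = []
    for combo in combinations_with_replacement(range(q), n):
        counts = Counter(combo)
        space.append(tuple(counts[i] for i in range(q)))
    return space


def deletion_distance(x, y):
    """Assumed metric: symbols to delete from x to reach the common sub-multiset.

    For vectors with equal sums this is symmetric, since the positive and
    negative parts of x - y have the same total.
    """
    return sum(max(a - b, 0) for a, b in zip(x, y))


def ball_sizes(n, q, r):
    """Map each center x in S_{n,q} to the size of its radius-r deletion ball."""
    space = multisets(n, q)
    return {
        x: sum(1 for y in space if deletion_distance(x, y) <= r)
        for x in space
    }


if __name__ == "__main__":
    sizes = ball_sizes(n=6, q=3, r=2)
    print("extreme center (6, 0, 0):", sizes[(6, 0, 0)])
    print("balanced center (2, 2, 2):", sizes[(2, 2, 2)])
    print("min / max over all centers:", min(sizes.values()), max(sizes.values()))
```

For $n=6$, $q=3$, $r=2$, the extreme center $(6,0,0)$ has a ball of size 6 and the balanced center $(2,2,2)$ has a ball of size 19, which are the minimum and maximum over all 28 centers under the assumed metric. Exhaustive enumeration is only feasible for small $n$ and $q$; it serves as a sanity check rather than a substitute for the generating-function formulas.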