A non-overlapping code is a set of codewords in which no nonempty proper prefix of any codeword is a suffix of any codeword in the set, including itself. If the codewords have variable lengths, it is additionally required that no codeword is contained in another codeword as a subword. Let $C(n,q)$ be the maximum size of a $q$-ary fixed-length non-overlapping code of length $n$. Upper bounds on $C(n,q)$ have been well studied. However, finding a nontrivial upper bound on the maximum size of variable-length non-overlapping codes of length at most $n$ remains open. In this paper, by establishing a link between variable-length non-overlapping codes and fixed-length ones, we show that the size of a $q$-ary variable-length non-overlapping code of length at most $n$ is upper bounded by $C(n,q)$. Furthermore, we prove that the average length of the codewords in a $q$-ary variable-length non-overlapping code is lower bounded by $\lceil \log_q \tilde{C} \rceil$, and is asymptotically no shorter than $n-2$ as $q$ approaches $\infty$, where $\tilde{C}$ denotes the cardinality of a $q$-ary variable-length non-overlapping code of length up to $n$.
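The two defining conditions can be checked mechanically. The following is a minimal sketch, not taken from the paper: the function name `is_non_overlapping` and the representation of codewords as Python strings are assumptions made for illustration, and "prefix" is read as "nonempty proper prefix".

```python
def is_non_overlapping(code):
    """Return True if `code` (an iterable of strings) is a non-overlapping code.

    Hypothetical helper for illustration; codewords are plain strings over
    any alphabet, and their lengths may differ.
    """
    words = list(set(code))  # duplicates do not change the set of codewords
    for u in words:
        for v in words:
            # Condition 1: no nonempty proper prefix of u is a suffix of v.
            # The case u == v is included on purpose (self-overlap).
            for k in range(1, len(u)):
                if v.endswith(u[:k]):
                    return False
            # Condition 2 (only matters for variable lengths): no codeword
            # occurs as a subword of a different codeword.
            if u != v and u in v:
                return False
    return True


# Fixed-length ternary example: first letters and last letters are disjoint.
print(is_non_overlapping(["01", "02"]))    # True
# Variable-length example satisfying both conditions.
print(is_non_overlapping(["01", "002"]))   # True
# "01" is a subword of "001", so the subword condition fails.
print(is_non_overlapping(["01", "001"]))   # False
# "0" is both a proper prefix and a suffix of "010" (self-overlap).
print(is_non_overlapping(["010"]))         # False
```

The pair `["01", "001"]` passes the prefix-suffix test on its own and is rejected only by the subword test, which illustrates why the extra requirement is stated separately for variable-length codes.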