Non-overlapping codes are block codes that have arisen in diverse contexts of computer science and biology. Applications typically require non-overlapping codes of large cardinality, but the maximum size of a non-overlapping code has been determined only when the codeword length divides the size of the alphabet, and for codewords of length two or three. For all other alphabet sizes and codeword lengths, no computationally feasible way to identify non-overlapping codes that attain the maximum size has been found to date. Herein we characterize maximal non-overlapping codes. We formulate the maximum non-overlapping code problem as an integer optimization problem and determine necessary conditions for optimality of a non-overlapping code. Moreover, we solve several instances of the optimization problem to show that the previously known constructions do not generate optimal codes for many alphabet sizes and codeword lengths. We also evaluate the number of distinct maximum non-overlapping codes.
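As a concrete illustration of the objects discussed above, the following minimal Python sketch checks whether a set of equal-length words is non-overlapping and finds the maximum code size by exhaustive search for very small parameters. The abstract does not restate the definition, so the sketch assumes the usual one: no proper prefix of any codeword equals a suffix of any codeword, including the codeword itself. The function names `is_non_overlapping` and `max_code_size_bruteforce` are hypothetical, and the exhaustive search is a stand-in for illustration only, not the integer optimization formulation of the paper.

```python
from itertools import combinations, product


def is_non_overlapping(code):
    """Return True if no proper prefix of a word equals a suffix of any word.

    Assumed definition: for every k with 1 <= k < n, the set of length-k
    prefixes and the set of length-k suffixes of the codewords are disjoint.
    Words may be strings or tuples, all of the same length n.
    """
    words = list(code)
    if not words:
        return True
    n = len(words[0])
    if any(len(w) != n for w in words):
        raise ValueError("all codewords must have the same length")
    for k in range(1, n):
        prefixes = {w[:k] for w in words}
        suffixes = {w[-k:] for w in words}
        if prefixes & suffixes:
            return False
    return True


def max_code_size_bruteforce(q, n):
    """Largest non-overlapping code over q symbols with word length n.

    Exhaustive search over subsets, so only feasible for tiny q and n.
    """
    # A word that overlaps with itself can never belong to a code,
    # so keep only the self-non-overlapping words as candidates.
    candidates = [w for w in product(range(q), repeat=n)
                  if is_non_overlapping([w])]
    # Try subset sizes from largest to smallest; the first hit is the maximum.
    for size in range(len(candidates), 0, -1):
        for subset in combinations(candidates, size):
            if is_non_overlapping(subset):
                return size
    return 0


if __name__ == "__main__":
    print(is_non_overlapping(["01", "02"]))    # prefixes {0}, suffixes {1, 2}
    print(is_non_overlapping(["001", "011"]))  # "01" is both a suffix and a prefix
    print(max_code_size_bruteforce(3, 2))
```

The number of candidate subsets grows exponentially with the number of self-non-overlapping words, which is why this brute-force approach stops being usable almost immediately and a structured optimization formulation is needed for larger alphabets and codeword lengths.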