We study the construction of $d$-deletion-correcting binary codes by formulating the problem as a Maximum Clique Problem (MCP). In this formulation, vertices represent candidate codewords, and an edge connects two codewords when their longest common subsequence (LCS) distance guarantees that they remain distinguishable after up to $d$ deletions. A valid codebook corresponds to a clique in the resulting graph, so finding the largest codebook is equivalent to finding a maximum clique. Although MCP-based formulations for deletion-correcting codes have been explored before, we show that Penalty-Guided Clique Search (PGCS), a lightweight stochastic clique-search heuristic inspired by Dynamic Local Search (DLS), consistently yields larger codebooks than existing graph-based heuristics, including minimum-degree and coloring methods, for block lengths $n = 8,9,\dots,14$ and deletion parameters $d = 1,2,3$. In several finite-length regimes, the resulting codebooks match known optimal sizes and outperform classical constructions such as Helberg codes. For decoding under segmented reception, where codeword boundaries are known, we propose an optimized LCS-based decoder that uses symbol-count filtering and early termination to reduce the number of LCS evaluations while preserving exact decoding guarantees. These optimizations give a lower average-case decoding complexity than the baseline approach, which evaluates the LCS against every codeword at a cost of $O(|C| n^2)$.