In this paper, we investigate the problem of designing $(n, N; \mathcal{B})$-reconstruction codes for $N\in \{14,11,9,5\}$, where $\mathcal{B}$ is the single-deletion single-substitution ball function, which maps a sequence to the set of all sequences obtainable from it via one deletion and one substitution. Such a code is defined by the requirement that, for any two distinct codewords, the intersection of their single-deletion single-substitution balls has size strictly less than the given number of noisy reads $N$. Note that for any $1\le N<N'$, an $(n, N; \mathcal{B})$-reconstruction code is also an $(n, N'; \mathcal{B})$-reconstruction code. It follows that designing $(n, N; \mathcal{B})$-reconstruction codes with low redundancy becomes more challenging as $N$ decreases; in particular, the case $N=1$ is exactly the problem of designing single-deletion single-substitution correcting codes. To the best of our knowledge, most existing results focus on the case where $N$ is a linear function of $n$, while only a few consider constant $N$. When $N=1$, the best known $(n, 1; \mathcal{B})$-reconstruction codes (single-deletion single-substitution correcting codes) require $(4+o(1))\log n$ bits of redundancy. In this work, we show that the redundancy can be reduced to $3\log n+4$ bits when $N=5$. As $N$ increases further to $9$ and $11$, the redundancy can be improved to $2\log n+12\log\log n+O(1)$ and $\log n +12\log \log n+O(1)$ bits, respectively. Finally, for $N=14$, we provide a reconstruction code with $\log n+3$ bits of redundancy, which is only two bits more than the redundancy of the best known $(n, 18; \mathcal{B})$-reconstruction codes.
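The defining condition can be checked by brute force for very short codes. The following is a minimal illustrative sketch, not the construction of this paper. It assumes a binary alphabet and interprets the ball as the set of sequences obtained by exactly one deletion and at most one substitution; the function names `ds_ball`, `max_intersection`, and `is_reconstruction_code` are hypothetical.

```python
from itertools import combinations


def ds_ball(x, alphabet="01"):
    """Single-deletion single-substitution ball of x.

    Assumption: exactly one deletion followed by at most one substitution.
    """
    ball = set()
    for i in range(len(x)):
        y = x[:i] + x[i + 1:]          # delete the symbol at position i
        ball.add(y)                    # zero substitutions
        for j in range(len(y)):
            for a in alphabet:
                if a != y[j]:          # substitute position j with a different symbol
                    ball.add(y[:j] + a + y[j + 1:])
    return ball


def max_intersection(code):
    """Largest ball intersection over all pairs of distinct codewords."""
    return max(
        (len(ds_ball(u) & ds_ball(v)) for u, v in combinations(code, 2)),
        default=0,
    )


def is_reconstruction_code(code, N):
    """True if every pairwise ball intersection has size strictly less than N."""
    return max_intersection(code) < N


if __name__ == "__main__":
    code = ["000", "111"]
    print(sorted(ds_ball("000")))            # ['00', '01', '10']
    print(max_intersection(code))            # 2
    print(is_reconstruction_code(code, 3))   # True
    print(is_reconstruction_code(code, 2))   # False
```

Because the check is a strict inequality on the largest pairwise intersection, a code that passes for some $N$ also passes for every $N'>N$, which matches the monotonicity noted above. The search enumerates all pairs of codewords, so it is only practical for toy examples.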