This paper studies the problem of encoding messages into sequences that can be uniquely recovered from noisy observations of their substrings. The observed reads comprise consecutive substrings with a given minimum overlap. This coded reconstruction problem has applications to DNA storage. We consider both single-strand and multi-strand reconstruction codes, in which the message is encoded into a single strand or a set of multiple strands, respectively. We study various parameter regimes and construct new codes, some of whose rates asymptotically attain the upper bounds.