We present a finite-blocklength performance bound for a DNA storage channel with insertions, deletions, and substitutions. The considered bound -- the dependency testing (DT) bound, introduced by Polyanskiy et al. in 2010 -- provides an upper bound on the achievable frame error probability and can be used to benchmark coding schemes in the practical short-to-medium blocklength regime. In particular, we consider a concatenated coding scheme in which an inner synchronization code deals with insertions and deletions and an outer code corrects the remaining (mostly substitution) errors. The bound depends on the inner synchronization code and can therefore guide its choice. For the outer code, we consider low-density parity-check codes, which we optimize using extrinsic information transfer charts. For a frame error probability of $10^{-3}$ and code rate 1/2, our optimized coding schemes achieve a normalized rate of $88\%$ to $96\%$ with respect to the DT bound for code lengths up to $2000$ DNA symbols.
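For reference, the DT bound in its general form, as commonly stated following Polyanskiy et al., can be written as below. The notation is an assumption made here for illustration and is not taken from the abstract: $M$ is the number of codewords, $P_X$ an arbitrary input distribution, $P_{Y|X}$ the channel, $P_Y$ the induced output distribution, and $\epsilon$ the average frame error probability. How the bound is specialized to the channel with insertions, deletions, and substitutions and to a given inner synchronization code is not described in the abstract and is not reproduced here.

$$
\epsilon \;\le\; \mathbb{E}\!\left[\exp\!\left(-\left[\, i(X;Y) - \log\frac{M-1}{2} \,\right]^{+}\right)\right],
\qquad
i(x;y) = \log\frac{\mathrm{d}P_{Y|X=x}}{\mathrm{d}P_{Y}}(y),
\qquad
[a]^{+} = \max(0,a).
$$

The expectation is taken over $(X,Y) \sim P_X P_{Y|X}$, and the bound guarantees the existence of a code with $M$ codewords whose average error probability does not exceed the right-hand side.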