Optimal reconstruction of a source sequence from multiple noisy traces corrupted by random insertions, deletions, and substitutions typically requires joint processing of all traces, leading to computational complexity that grows exponentially with the number of traces. In this work, we propose an iterative belief-combining procedure that computes symbol-wise a posteriori probabilities by propagating trace-wise inferences via message passing. We prove that, upon convergence, our method achieves the same reconstruction performance as joint maximum a posteriori estimation, while reducing the complexity to quadratic in the number of traces. This performance equivalence is validated using a real-world dataset of clustered short-strand DNA reads.