DNA-based storage offers unprecedented density and durability, but its scalability is fundamentally limited by the efficiency of parallel strand synthesis. Existing methods either allow unconstrained nucleotide additions to individual strands, as in enzymatic synthesis, or enforce identical additions across many strands, as in photolithographic synthesis. We introduce and analyze a hybrid synthesis framework that generalizes both approaches: in each cycle, a nucleotide is selected from a restricted subset and incorporated in parallel. This model gives rise to a new notion, the complex synthesis sequence. Building on this framework, we extend the information rate definition of Lenz et al. and define and analyze an analog of the deletion ball in this setting, deriving tight expressions for the maximal information rate and its asymptotic behavior. These results bridge the theoretical gap between constrained models and the idealized setting in which every nucleotide is always available. For the case of known strands, we design a dynamic programming algorithm that computes an optimal complex synthesis sequence, highlighting structural similarities to the shortest common supersequence problem. We also define a distinct two-dimensional array model with synthesis constraints over the rows, which extends previous synthesis models in the literature and captures new structural limitations in large-scale strand arrays, and we develop a dynamic programming algorithm for this model as well. Our results establish a new and comprehensive theoretical framework for constrained DNA synthesis, subsuming prior models and setting the stage for future advances in the field.
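To make the connection to the shortest common supersequence (SCS) problem concrete, the following is a minimal sketch and not the algorithm of this work. It covers only the classical special case in which each cycle offers a single nucleotide, where the minimum number of cycles for two known strands equals their SCS length. The helper `can_synthesize` rests on an assumed reading of the hybrid model: a cycle is a set of allowed nucleotides, and a strand appends at most one nucleotide per cycle. Both function names are hypothetical.

```python
from typing import List, Set


def scs_length(a: str, b: str) -> int:
    """Length of a shortest common supersequence of strands a and b.

    Classical two-strand dynamic program; this is the single-nucleotide
    special case only, not the complex synthesis sequence algorithm.
    """
    m, n = len(a), len(b)
    # dp[i][j] = SCS length of the prefixes a[:i] and b[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        for j in range(n + 1):
            if i == 0:
                dp[i][j] = j  # only b's prefix remains
            elif j == 0:
                dp[i][j] = i  # only a's prefix remains
            elif a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1  # one cycle serves both strands
            else:
                dp[i][j] = 1 + min(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]


def can_synthesize(strand: str, cycles: List[Set[str]]) -> bool:
    """Check whether a strand can be built from a sequence of cycles.

    Assumption (not from the source): each cycle is a set of allowed
    nucleotides and the strand may append at most one of them per cycle.
    Greedy earliest matching suffices for this feasibility check.
    """
    pos = 0
    for allowed in cycles:
        if pos < len(strand) and strand[pos] in allowed:
            pos += 1
    return pos == len(strand)
```

For example, `scs_length("ACGT", "AGCT")` evaluates to 5, and the five singleton cycles A, G, C, G, T synthesize both strands under `can_synthesize`. Allowing larger sets per cycle can only shorten the required sequence, which is the gap between the two extreme regimes that the hybrid model interpolates.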