As data storage evolves, researchers have increasingly explored non-traditional platforms, and DNA-based storage has emerged as a promising candidate. Our work is motivated by in-vivo DNA storage, which can store large amounts of information efficiently and confidentially within an organism's native DNA. In-vivo DNA storage is, however, susceptible to errors introduced by mutations. To understand the long-term behavior of such mutation systems, we study the frequency of $k$-tuples after multiple applications of mutations, generalizing earlier results on $k$-tuple frequencies in mutation systems. We provide a broad analysis based on constructing a specialized matrix and identifying its eigenvectors. For substitution and duplication systems, we use previous results on almost sure convergence to equate the expected frequency with the limiting frequency. Moreover, we demonstrate convergence in probability under certain assumptions.