We propose a novel and flexible DNA-storage architecture that divides the storage space into fixed-size units (blocks). Each block can be accessed independently and efficiently at random for both read and write operations, and consecutive data blocks can additionally be accessed efficiently in sequence. In contrast to prior work, a pair of random-access PCR primers of length 20 in our architecture defines not a single object but an independent storage partition, which is internally divided into blocks and managed independently of other partitions. We expose the flexibility and the constraints with which the internal address space of each partition can be managed, and incorporate them into our design to provide rich and functional storage semantics, such as block-storage organization, efficient data updates, and sequential access. To leverage the full power of the prefix-based nature of PCR addressing, we define a methodology for transforming the internal addressing scheme of a partition into an equivalent PCR-compatible scheme. This allows us to run PCR with primers that are elongated by a variable amount to include a desired part of the internal address, narrowing the scope of the reaction to retrieve a specific block or a range of blocks within the partition with sufficiently high accuracy. Our wet-lab evaluation demonstrates the practicality of the proposed ideas and a 140x reduction in sequencing cost and latency for retrieving individual blocks within a partition.
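The prefix-based selection described in the abstract can be illustrated with a toy model. The sketch below is not the paper's encoding: it assumes each strand is simply a partition primer followed by a fixed-width block address and a payload, and it uses a naive base-4 mapping from block index to nucleotides. The function names (`encode_address`, `build_partition`, `pcr_select`), the primer sequence, and the payload are all hypothetical, and real PCR behaviour (mispriming, sequence constraints, reverse primers) is ignored. A plain string-prefix match stands in for the reaction.

```python
from typing import Dict, List

NUCLEOTIDES = "ACGT"  # assumed digit order for the toy base-4 address encoding


def encode_address(block: int, width: int) -> str:
    """Encode a block index as a fixed-width base-4 nucleotide string."""
    if not 0 <= block < 4 ** width:
        raise ValueError("block index out of range for this address width")
    digits = []
    for _ in range(width):
        block, rem = divmod(block, 4)
        digits.append(NUCLEOTIDES[rem])
    # Digits were produced least-significant first, so reverse them.
    return "".join(reversed(digits))


def build_partition(primer: str, payloads: List[str], width: int) -> Dict[int, str]:
    """Model each strand as primer + block address + payload (toy layout)."""
    return {
        i: primer + encode_address(i, width) + payload
        for i, payload in enumerate(payloads)
    }


def pcr_select(pool: Dict[int, str], primer: str, address_prefix: str = "") -> List[int]:
    """Return the blocks whose strand starts with the elongated primer.

    Elongating the primer with part of the address narrows the match from
    the whole partition to a range of blocks, or to a single block.
    """
    elongated = primer + address_prefix
    return sorted(i for i, strand in pool.items() if strand.startswith(elongated))


# Hypothetical 20-nucleotide partition primer and 16 blocks (address width 2).
PRIMER = "ACGTACGTACGTACGTACGT"
pool = build_partition(PRIMER, ["GATTACA"] * 16, width=2)

whole_partition = pcr_select(pool, PRIMER)        # no elongation
block_range = pcr_select(pool, PRIMER, "C")       # one address nucleotide
single_block = pcr_select(pool, PRIMER, "CG")     # full address
```

In this model the unelongated primer matches all 16 blocks, elongating it by one address nucleotide (`"C"`) matches the four consecutive blocks 4 to 7, and elongating it by the full address (`"CG"`) matches block 6 alone. With this naive encoding a shorter prefix always covers an aligned, contiguous range of block indices, which is the property that makes sequential access to consecutive blocks expressible as a single prefix.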