Although the cost of DNA sequencing has been decreasing rapidly, it currently stands at roughly \$1.3K/TB, far higher than the cost of reading from existing archival storage solutions. In this work, we aim to reduce both the cost and the latency of DNA storage by studying the DNA coverage depth problem, which asks how to reduce the number of reads required to retrieve information from the storage system. Under this framework, our main goal is to understand how to optimally pair an error-correcting code with a given retrieval algorithm so as to minimize the sequencing coverage depth while guaranteeing retrieval of the information with high probability. Additionally, we study the DNA coverage depth problem under the random-access setup.
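To make the coverage depth question concrete, the following is a minimal sketch under simplifying assumptions that are not taken from this work: each read returns one of n stored strands uniformly at random, reads are error-free, and the information is recoverable from any k distinct strands (as with an MDS code). Under those assumptions, the expected number of reads follows the standard partial coupon-collector formula. The function names `expected_reads` and `simulate_reads` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction
import random


def expected_reads(n: int, k: int) -> float:
    """Expected number of uniform random reads needed to observe k distinct
    strands out of n (partial coupon collector): sum_{i=0}^{k-1} n / (n - i).
    """
    if not 0 <= k <= n:
        raise ValueError("require 0 <= k <= n")
    return float(sum(Fraction(n, n - i) for i in range(k)))


def simulate_reads(n: int, k: int, trials: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity (assumed uniform sampling)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen = set()
        reads = 0
        # Keep sequencing until k distinct strands have been observed.
        while len(seen) < k:
            seen.add(rng.randrange(n))
            reads += 1
        total += reads
    return total / trials


if __name__ == "__main__":
    k = 100
    for n in (100, 150, 200):  # n = k means no redundancy
        print(n, expected_reads(n, k), simulate_reads(n, k))
```

In this toy model, adding redundancy (n > k) lowers the expected number of reads compared with the uncoded case n = k, which illustrates why the choice of code matters for coverage depth. It does not model sequencing errors, non-uniform sampling, or the random-access setting.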