Over the past few years, the ever-growing demand for data storage, particularly for "cold" (i.e., rarely accessed) data, has motivated research into alternative data storage systems. Because of their biochemical characteristics, synthetic DNA molecules are now considered serious candidates for this new kind of storage. This paper introduces a novel arithmetic coder for DNA data storage and presents results for a lossy, JPEG 2000 based image compression method that is adapted to DNA data storage and uses this coder. The DNA coding algorithms presented here are designed to compress images efficiently, encode them into a quaternary code, and finally store them in synthetic DNA molecules. This work also aims to make the compression models better fit the problems encountered when storing data in DNA, namely that DNA writing, storage, and reading are error-prone processes. The main takeaway of this work is our arithmetic coder and its integration into a high-performance image codec.
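To make the notion of a quaternary code concrete, the following minimal sketch maps binary data onto the four nucleotides at a fixed rate of two bits per nucleotide. It is an illustration only and is not the arithmetic coder introduced in this paper: the symbol order `ACGT` and the helper names `bytes_to_dna` and `dna_to_bytes` are assumptions made for the example, and the mapping ignores the error-prone nature of DNA writing, storage, and reading that the proposed coder is meant to address.

```python
# Hypothetical illustration: a fixed 2-bit-per-nucleotide mapping.
# This is NOT the arithmetic coder introduced in the paper.
NUCLEOTIDES = "ACGT"  # assumed symbol order: A=0, C=1, G=2, T=3


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bits first."""
    out = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            out.append(NUCLEOTIDES[(byte >> shift) & 0b11])
    return "".join(out)


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of four."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(strand), 4):
        byte = 0
        for nt in strand[i:i + 4]:
            # str.index raises ValueError on any symbol outside ACGT
            byte = (byte << 2) | NUCLEOTIDES.index(nt)
        out.append(byte)
    return bytes(out)
```

Under these assumptions, `bytes_to_dna(b"\x1b")` yields `"ACGT"`, and `dna_to_bytes` recovers the original bytes. An entropy coder such as the one proposed here would replace this fixed mapping with a variable-rate one driven by the compression model.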