Iterative Soft Decoding Algorithm for DNA Storage Using Quality Score and Redecoding - 专知论文

会员服务 ·

0

解码算法 · 软解码 · 解码 · 存储 · 纠错码 ·

2023 年 4 月 7 日

Iterative Soft Decoding Algorithm for DNA Storage Using Quality Score and Redecoding

翻译：基于质量评分与重新解码的DNA存储迭代软解码算法

Jaeho Jeong,Hosung Park,Hee-Youl Kwak,Jong-Seon No,Hahyeon Jeon,Jeong Wook Lee,Jae-Won Kim

Ever since deoxyribonucleic acid (DNA) was considered as a next-generation data-storage medium, lots of research efforts have been made to correct errors occurred during the synthesis, storage, and sequencing processes using error correcting codes (ECCs). Previous works on recovering the data from the sequenced DNA pool with errors have utilized hard decoding algorithms based on a majority decision rule. To improve the correction capability of ECCs and robustness of the DNA storage system, we propose a new iterative soft decoding algorithm, where soft information is obtained from FASTQ files and channel statistics. In particular, we propose a new formula for log-likelihood ratio (LLR) calculation using quality scores (Q-scores) and a redecoding method which may be suitable for the error correction and detection in the DNA sequencing area. Based on the widely adopted encoding scheme of the fountain code structure proposed by Erlich et al., we use three different sets of sequenced data to show consistency for the performance evaluation. The proposed soft decoding algorithm gives 2.3% ~ 7.0% improvement of the reading number reduction compared to the state-of-the-art decoding method and it is shown that it can deal with erroneous sequenced oligo reads with insertion and deletion errors.

翻译：自脱氧核糖核酸（DNA）被视作下一代数据存储介质以来，大量研究致力于利用纠错码（ECC）纠正合成、存储和测序过程中产生的错误。以往从含有错误的测序DNA池中恢复数据的工作，通常采用基于多数表决规则的硬解码算法。为提升ECC的纠错能力及DNA存储系统的鲁棒性，我们提出了一种新的迭代软解码算法，该算法从FASTQ文件和信道统计中获取软信息。具体而言，我们提出了一种利用质量评分（Q-score）计算对数似然比（LLR）的新公式，以及一种适用于DNA测序领域纠错检测的重新解码方法。基于Erlich等人提出的喷泉码结构这一广泛采用的编码方案，我们使用三组不同的测序数据验证性能评估的一致性。与现有最优解码方法相比，所提出的软解码算法在读数缩减率上提升了2.3%至7.0%，并且能够处理含有插入和删除错误的测序寡核苷酸序列。

0

相关内容

解码算法

宾夕法尼亚大学最新《不确定性估计》课程笔记，134页pdf，附Slides

宾夕法尼亚大学最新《不确定性估计》课程笔记，134页pdf，附Slides

专知会员服务

49+阅读 · 2022年11月13日

【Manning新书】如何掌握Go语言？这100个100个Go语言错误得避免，691页pdf

【Manning新书】如何掌握Go语言？这100个100个Go语言错误得避免，691页pdf

专知会员服务

41+阅读 · 2022年10月1日

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

78+阅读 · 2021年1月30日

【新书】人工智能Python代码，227页pdf，Python code for Artificial Intelligence: Foundations of Computational Agents

【新书】人工智能Python代码，227页pdf，Python code for Artificial Intelligence: Foundations of Computational Agents

专知会员服务

107+阅读 · 2020年6月21日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【Manning干货书】机器学习系统可扩展规模设计（MLS），224页PDF

【Manning干货书】机器学习系统可扩展规模设计（MLS），224页PDF

专知会员服务

76+阅读 · 2020年1月21日

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

专知会员服务

38+阅读 · 2019年12月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

【重磅最新】ICLR2023顶会376篇深度强化学习论文得分出炉(376/4753,占比8%)

【重磅最新】ICLR2023顶会376篇深度强化学习论文得分出炉(376/4753,占比8%)

专知

3+阅读 · 2022年11月10日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

进一步改进GPT和BERT：使用Transformer的语言模型

进一步改进GPT和BERT：使用Transformer的语言模型

机器之心

16+阅读 · 2019年5月1日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

R工程化—Rest API 之plumber包

R工程化—Rest API 之plumber包

R语言中文社区

11+阅读 · 2018年12月25日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

专知

14+阅读 · 2018年1月22日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

高通量测序的可计算建模与应用基础算法

国家自然科学基金

1+阅读 · 2015年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

6+阅读 · 2014年12月31日

以PI4KIIα为靶点抗肿瘤抑制剂的筛选及优化

国家自然科学基金

0+阅读 · 2014年12月31日

基于海量样本的高性能元基因组数据分析策略和方法开发

国家自然科学基金

0+阅读 · 2012年12月31日

基于高通量测序和从头组装的癌症基因组变异位点检测方法和软件开发

国家自然科学基金

0+阅读 · 2012年12月31日

靶向Neuropilin和GPC-3受体的双靶点PET分子探针的构建和鉴定

国家自然科学基金

0+阅读 · 2012年12月31日

仅基于RNA-Seq数据拼装可变剪接转录组的计算方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

荧光组合编码的高通量DNA连接测序方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

de novo预测蛋白质结构的并行元启发方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

人肠道微生物组的基因预测与序列拼接方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

Clustering Method for Time-Series Images Using Quantum-Inspired Computing Technology

Arxiv

0+阅读 · 2023年5月26日

Metaheuristic planner for cooperative multi-agent wall construction with UAVs

Arxiv

0+阅读 · 2023年5月25日

Masked and Permuted Implicit Context Learning for Scene Text Recognition

Arxiv

0+阅读 · 2023年5月25日

A continuum and computational framework for viscoelastodynamics: II. Strain-driven and energy-momentum consistent schemes

Arxiv

0+阅读 · 2023年5月25日

Accelerated solutions of convection-dominated partial differential equations using implicit feature tracking and empirical quadrature

Arxiv

0+阅读 · 2023年5月25日

iPlanner: Imperative Path Planning

Arxiv

0+阅读 · 2023年5月24日

SITReg: Multi-resolution architecture for symmetric, inverse consistent, and topology preserving image registration using deformation inversion layers

Arxiv

0+阅读 · 2023年5月24日

Towards Optimizing Storage Costs on the Cloud

Arxiv

0+阅读 · 2023年5月24日

Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation

Arxiv

11+阅读 · 2021年12月9日

Contrastive Clustering

Arxiv

31+阅读 · 2020年9月21日

VIP会员

文章信息

相关主题

最新内容

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

专知会员服务

5+阅读 · 今天8:00

重新思考无人机时代的生存能力

重新思考无人机时代的生存能力

专知会员服务

3+阅读 · 今天7:44

装甲突击旅：现代战争思考、战斗与组织

装甲突击旅：现代战争思考、战斗与组织

专知会员服务

2+阅读 · 今天7:28

在人工智能加速决策环境中拓展OODA循环

在人工智能加速决策环境中拓展OODA循环

专知会员服务

3+阅读 · 今天7:18

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

《廉价自杀式无人机战争的军事战略影响：乌克兰与伊朗案例研究》

专知会员服务

4+阅读 · 今天7:07

军事欺骗：供作战战术指挥官使用的工具

军事欺骗：供作战战术指挥官使用的工具

专知会员服务

3+阅读 · 今天7:03

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

ICML 2026 | CFPO：用反事实策略优化提升多模态推理

专知会员服务

4+阅读 · 6月23日

综述 | 世界动作模型：少做梦，多行动

综述 | 世界动作模型：少做梦，多行动

专知会员服务

5+阅读 · 6月23日

美以伊冲突：无人机与人工智能的运用

美以伊冲突：无人机与人工智能的运用

专知会员服务

10+阅读 · 6月23日

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

《战时图神经网络：整合以色列-伊朗冲突中的网络安全与无人机智能》最新50页文献

专知会员服务

4+阅读 · 6月23日

《特种部队在透明战场中的生存力》最新报告

《特种部队在透明战场中的生存力》最新报告

专知会员服务

5+阅读 · 6月23日

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

《自主无人机蜂群协同与控制系统：人工智能赋能的战场协同与自主任务编排平台》

专知会员服务

8+阅读 · 6月23日

《人工智能生成的零日漏洞：对未来作战的影响》

《人工智能生成的零日漏洞：对未来作战的影响》

专知会员服务

7+阅读 · 6月23日

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

《理解伙伴国在防务能力选择中的偏好：探索美国解决方案的替代选择》美智库200页报告

专知会员服务

4+阅读 · 6月23日

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

ICML 2026 | 边界嵌入塑形：用自适应对比学习破解图结构纠缠

专知会员服务

6+阅读 · 6月22日

相关VIP内容

宾夕法尼亚大学最新《不确定性估计》课程笔记，134页pdf，附Slides

宾夕法尼亚大学最新《不确定性估计》课程笔记，134页pdf，附Slides

专知会员服务

49+阅读 · 2022年11月13日

【Manning新书】如何掌握Go语言？这100个100个Go语言错误得避免，691页pdf

【Manning新书】如何掌握Go语言？这100个100个Go语言错误得避免，691页pdf

专知会员服务

41+阅读 · 2022年10月1日

【AAAI2021】对比聚类，Contrastive Clustering

【AAAI2021】对比聚类，Contrastive Clustering

专知会员服务

78+阅读 · 2021年1月30日

【新书】人工智能Python代码，227页pdf，Python code for Artificial Intelligence: Foundations of Computational Agents

【新书】人工智能Python代码，227页pdf，Python code for Artificial Intelligence: Foundations of Computational Agents

专知会员服务

107+阅读 · 2020年6月21日

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

【CVPR2020】视觉跟踪的概率回归，Probabilistic Regression for Visual Tracking

专知会员服务

37+阅读 · 2020年3月27日

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

图像分类技巧集，17页ppt《Bag of Tricks for Image Classification》

专知会员服务

96+阅读 · 2020年3月12日

【Manning干货书】机器学习系统可扩展规模设计（MLS），224页PDF

【Manning干货书】机器学习系统可扩展规模设计（MLS），224页PDF

专知会员服务

76+阅读 · 2020年1月21日

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

【ECML-PKDD 2019】多维时间序列和事件日志的模式挖掘和异常检测框架（A framework for pattern mining and anomalydetection in multi-dimensional time series andevent logs）

专知会员服务

38+阅读 · 2019年12月1日

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

Auto-Sizing the Transformer Network: Improving Speed, Efficiency, and Performance for Low-Resource Machine Translation

专知会员服务

50+阅读 · 2019年10月17日

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

Deep Learning Based Detection and Correction of Cardiac MR Motion Artefacts During Reconstruction for High-Quality Segmentation

专知会员服务

59+阅读 · 2019年10月17日

热门VIP内容

开通专知VIP会员享更多权益服务

重新思考无人机时代的生存能力

在人工智能加速决策环境中拓展OODA循环

反无人机拦截器训练与运用课程：对美国陆军部队发展的启示

装甲突击旅：现代战争思考、战斗与组织

相关资讯

【重磅最新】ICLR2023顶会376篇深度强化学习论文得分出炉(376/4753,占比8%)

【重磅最新】ICLR2023顶会376篇深度强化学习论文得分出炉(376/4753,占比8%)

专知

3+阅读 · 2022年11月10日

RoBERTa中文预训练模型：RoBERTa for Chinese

RoBERTa中文预训练模型：RoBERTa for Chinese

PaperWeekly

57+阅读 · 2019年9月16日

Hierarchically Structured Meta-learning

Hierarchically Structured Meta-learning

CreateAMind

27+阅读 · 2019年5月22日

进一步改进GPT和BERT：使用Transformer的语言模型

进一步改进GPT和BERT：使用Transformer的语言模型

机器之心

16+阅读 · 2019年5月1日

深度自进化聚类：Deep Self-Evolution Clustering

深度自进化聚类：Deep Self-Evolution Clustering

我爱读PAMI

15+阅读 · 2019年4月13日

R工程化—Rest API 之plumber包

R工程化—Rest API 之plumber包

R语言中文社区

11+阅读 · 2018年12月25日

disentangled-representation-papers

disentangled-representation-papers

CreateAMind

26+阅读 · 2018年9月12日

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

【论文推荐】最新5篇目标跟踪（Object Tracking）相关论文—并行跟踪和验证、光流、自动跟踪、相关滤波集成、CFNet

专知

25+阅读 · 2018年2月6日

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

【论文推荐】最新5篇视觉目标跟踪相关论文—递归神经网络、深度适应计算策略、视觉目标跟踪基准、深度核化相关滤波、检测并跟踪

专知

14+阅读 · 2018年1月22日

强化学习族谱

强化学习族谱

CreateAMind

26+阅读 · 2017年8月2日

相关论文

Clustering Method for Time-Series Images Using Quantum-Inspired Computing Technology

Arxiv

0+阅读 · 2023年5月26日

Metaheuristic planner for cooperative multi-agent wall construction with UAVs

Arxiv

0+阅读 · 2023年5月25日

Masked and Permuted Implicit Context Learning for Scene Text Recognition

Arxiv

0+阅读 · 2023年5月25日

A continuum and computational framework for viscoelastodynamics: II. Strain-driven and energy-momentum consistent schemes

Arxiv

0+阅读 · 2023年5月25日

Accelerated solutions of convection-dominated partial differential equations using implicit feature tracking and empirical quadrature

Arxiv

0+阅读 · 2023年5月25日

iPlanner: Imperative Path Planning

Arxiv

0+阅读 · 2023年5月24日

SITReg: Multi-resolution architecture for symmetric, inverse consistent, and topology preserving image registration using deformation inversion layers

Arxiv

0+阅读 · 2023年5月24日

Towards Optimizing Storage Costs on the Cloud

Arxiv

0+阅读 · 2023年5月24日

Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation

Arxiv

11+阅读 · 2021年12月9日

Contrastive Clustering

Arxiv

31+阅读 · 2020年9月21日

相关基金

高通量测序的可计算建模与应用基础算法

国家自然科学基金

1+阅读 · 2015年12月31日

概率抽样设计及其统计推断方法

国家自然科学基金

6+阅读 · 2014年12月31日

以PI4KIIα为靶点抗肿瘤抑制剂的筛选及优化

国家自然科学基金

0+阅读 · 2014年12月31日

基于海量样本的高性能元基因组数据分析策略和方法开发

国家自然科学基金

0+阅读 · 2012年12月31日

基于高通量测序和从头组装的癌症基因组变异位点检测方法和软件开发

国家自然科学基金

0+阅读 · 2012年12月31日

靶向Neuropilin和GPC-3受体的双靶点PET分子探针的构建和鉴定

国家自然科学基金

0+阅读 · 2012年12月31日

仅基于RNA-Seq数据拼装可变剪接转录组的计算方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

荧光组合编码的高通量DNA连接测序方法研究

国家自然科学基金

0+阅读 · 2012年12月31日

de novo预测蛋白质结构的并行元启发方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

人肠道微生物组的基因预测与序列拼接方法研究

国家自然科学基金

0+阅读 · 2009年12月31日

微信扫码咨询专知VIP会员