Data-Driven Bee Identification for DNA Strands

May 8, 2023

Shubhransh Singhvi,Avital Boruchovsky,Han Mao Kiah,Eitan Yaakobi

From arXiv. Conference paper accepted at ISIT 2023.

We study a data-driven approach to the bee identification problem for DNA strands. The bee identification problem, introduced by Tandon et al. (2019), requires one to identify $M$ bees, each tagged by a unique barcode, via a set of $M$ noisy measurements. Later, Chrisnata et al. (2022) extended the model to the case where one observes $N$ noisy measurements of each bee, and applied the model to address the unordered nature of DNA storage systems. In such systems, a unique address is typically prepended to each DNA data block to form a DNA strand, but the address may be corrupted. While clustering is usually used to identify the address of a DNA strand, this requires $\mathcal{M}^2$ data comparisons (where $\mathcal{M}$ is the number of reads). In contrast, the approach of Chrisnata et al. (2022) avoids data comparisons completely. In this work, we study an intermediate, data-driven approach to this identification task. For the binary erasure channel, we first show that we can almost surely identify all DNA strands correctly under certain mild assumptions. Then we propose a data-driven pruning procedure and demonstrate that on average the procedure uses only a fraction of the $\mathcal{M}^2$ data comparisons. Specifically, for $\mathcal{M} = 2^n$ and erasure probability $p$, the expected number of data comparisons performed by the procedure is $\kappa\mathcal{M}^2$, where $\left(\frac{1+2p-p^2}{2}\right)^n \leq \kappa \leq \left(\frac{1+p}{2}\right)^n$.
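To get a feel for the size of the fraction $\kappa$, the two expressions in the abstract can be evaluated numerically. The following is a minimal sketch, not code from the paper: the function names `kappa_expressions` and `comparison_counts` are hypothetical, and the formulas are copied as printed above. Note that for $0 < p < 1$ the left-hand expression as printed is numerically larger than the right-hand one (the numerators differ by $p(1-p)$), so the sketch only evaluates both expressions and makes no claim about which one bounds $\kappa$ from which side.

```python
def kappa_expressions(n: int, p: float) -> tuple[float, float]:
    """Evaluate the two expressions around kappa, as printed in the abstract.

    Returns (left, right), where
      left  = ((1 + 2p - p^2) / 2) ** n
      right = ((1 + p) / 2) ** n
    The names and the return order are illustrative assumptions.
    """
    if n < 0:
        raise ValueError("n must be non-negative")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a probability in [0, 1]")
    left = ((1 + 2 * p - p * p) / 2) ** n
    right = ((1 + p) / 2) ** n
    return left, right


def comparison_counts(n: int, p: float) -> tuple[float, float]:
    """Scale both expressions by M^2, with M = 2^n reads as in the abstract."""
    m = 2 ** n
    left, right = kappa_expressions(n, p)
    return left * m * m, right * m * m


if __name__ == "__main__":
    # Print the two kappa expressions and the all-pairs baseline M^2.
    for p in (0.0, 0.05, 0.1, 0.2):
        left, right = kappa_expressions(10, p)
        print(f"n=10 p={p}: left={left:.3e} right={right:.3e} M^2={4 ** 10}")
```

For example, `kappa_expressions(1, 0.5)` returns `(0.875, 0.75)`. At $p = 0$ both expressions reduce to $2^{-n}$, so $\kappa\mathcal{M}^2 = \mathcal{M}$.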
