With the introduction of cyber-physical genome sequencing and editing technologies such as CRISPR, researchers can more easily access tools to investigate questions and develop remedies across genetics and health science (e.g., agriculture and medicine). As the field advances and grows, new concerns arise around the ability to predict off-target behavior. In this work, we explore the underlying biological and chemical model from a data-driven perspective. Additionally, we present a machine-learning-based solution named \textit{Guide-Guard} that predicts the behavior of the system given a gRNA in the CRISPR gene-editing process with 84\% accuracy. The solution can be trained on multiple genes at the same time while retaining its accuracy.