Finely tuned enzymatic pathways control cellular processes, and their dysregulation can lead to disease. Creating predictive and interpretable models of these pathways is challenging because of the complexity of the pathways themselves and of their cellular and genomic contexts. Here we introduce Elektrum, a deep learning framework that addresses these challenges with data-driven, biophysically interpretable models for determining the kinetics of biochemical systems. First, Elektrum uses in vitro kinetic assays to rapidly hypothesize an ensemble of high-quality Kinetically Interpretable Neural Networks (KINNs) that predict reaction rates. It then employs a novel transfer learning step in which the KINNs are inserted as intermediary layers into deeper convolutional neural networks, fine-tuning the predictions for reaction-dependent in vivo outcomes. In this way, Elektrum makes effective use of both the limited but clean in vitro data and the complex yet plentiful in vivo data that capture cellular context. We apply Elektrum to predict CRISPR-Cas9 off-target editing probabilities and demonstrate that it achieves state-of-the-art performance, regularizes neural network architectures, and maintains physical interpretability.
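To make the idea of a kinetically interpretable model concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical linear reaction scheme whose rate constants are log-linear functions of a one-hot encoded sequence, and it reports the overall reaction rate as the inverse mean first-passage time to an absorbing product state. The function names (`one_hot`, `rate_constants`, `effective_rate`, `predict_rate`), the featurization, and the choice of scheme are all illustrative assumptions; the transfer learning step into a deeper convolutional network is not shown.

```python
import numpy as np

def one_hot(seq, alphabet="ACGT"):
    """Flat one-hot encoding of a nucleotide string (hypothetical featurization)."""
    index = {c: i for i, c in enumerate(alphabet)}
    x = np.zeros((len(seq), len(alphabet)))
    for pos, c in enumerate(seq):
        x[pos, index[c]] = 1.0
    return x.ravel()

def rate_constants(x, W, b):
    """Assumed log-linear map from features to positive rate constants: k = exp(W x + b)."""
    return np.exp(W @ x + b)

def effective_rate(k_fwd, k_bwd):
    """Overall rate through a linear chain of states ending in an absorbing product.

    k_fwd[i] is the rate from state i to state i+1 (the last entry leads to the product);
    k_bwd[i] is the rate from state i+1 back to state i.
    Returns 1 / (mean first-passage time from state 0 to the product).
    """
    k_fwd = np.asarray(k_fwd, dtype=float)
    k_bwd = np.asarray(k_bwd, dtype=float)
    n = len(k_fwd)
    assert len(k_bwd) == n - 1, "need one backward rate per pair of adjacent states"
    # Generator matrix restricted to the n transient (non-product) states.
    Q = np.zeros((n, n))
    for i in range(n):
        Q[i, i] -= k_fwd[i]
        if i + 1 < n:
            Q[i, i + 1] += k_fwd[i]
            Q[i + 1, i] += k_bwd[i]
            Q[i + 1, i + 1] -= k_bwd[i]
    # Mean first-passage times t satisfy Q t = -1.
    t = np.linalg.solve(Q, -np.ones(n))
    return 1.0 / t[0]

def predict_rate(seq, W, b, n_states):
    """Sequence -> rate constants -> overall reaction rate."""
    k = rate_constants(one_hot(seq), W, b)
    return effective_rate(k[:n_states], k[n_states:])

# Toy usage with untrained (all-zero) parameters, so every rate constant equals 1.
W = np.zeros((3, 16))   # 2 forward + 1 backward rate, 4 bases x 4 positions
b = np.zeros(3)
rate = predict_rate("ACGT", W, b, n_states=2)
```

In this sketch every learned parameter contributes to a named rate constant, so the fitted values can be read as kinetic quantities rather than opaque weights. For the two-state case with forward rates a and c and backward rate b, the function reduces to the closed form ac / (a + b + c).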