Phage display is a powerful laboratory technique used to study the interactions between proteins and other molecules, whether other proteins, peptides, DNA or RNA. The under-utilisation of phage display data in conjunction with deep learning models for protein design may be attributed to high experimental noise levels, the complex nature of data pre-processing, and the difficulty of interpreting the experimental results. In this work, we propose a novel approach that uses a Bayesian Neural Network within a training loop to simulate the phage display experiment and its associated noise. Our goal is to investigate how understanding the experimental noise and model uncertainty can enable such models to be applied reliably to the interpretation of phage display experiments. We validate our approach using actual binding affinity measurements instead of relying solely on proxy values derived from 'held-out' phage display rounds.