With the rise of engineered biomolecular devices, there is a growing need for tailor-made biological sequences. Often, many similar biological sequences must be made for a specific application, so optimizing them requires numerous, sometimes prohibitively expensive, lab experiments. This paper presents a transfer learning design-of-experiments workflow to make this development feasible. By combining a transfer learning surrogate model with Bayesian optimization, we show how sharing information between optimization tasks reduces the total number of experiments. We demonstrate this reduction using data from the development of DNA competitors for use in an amplification-based diagnostic assay. We use cross-validation to compare the predictive accuracy of different transfer learning models, and then compare the models' performance on both single-objective and penalized optimization tasks.