Computer-aided molecular design (CAMD) studies quantitative structure-property relationships and uses optimization algorithms to discover molecules with desired properties. With the emergence of machine learning models, CAMD score functions may be replaced by surrogates that learn the structure-property relationships automatically. Owing to their outstanding performance on graph domains, graph neural networks (GNNs) have recently appeared frequently in CAMD, but using them introduces new optimization challenges. This paper formulates GNNs using mixed-integer programming and integrates this formulation into the optimization and machine learning toolkit OMLT. To characterize and formulate molecules, we inherit the well-established mixed-integer optimization formulation for CAMD and propose symmetry-breaking constraints to remove symmetric solutions caused by graph isomorphism. In two case studies, we investigate fragment-based odorant molecular design with more practical requirements to test the compatibility and performance of our approaches.