An important aspect in the development of small molecules as drugs or agrochemicals is their systemic availability after intravenous and oral administration. Predicting the systemic availability from the chemical structure of a potential candidate is highly desirable, as it allows drug or agrochemical development to focus on compounds with a favorable kinetic profile. However, such predictions are challenging, because the availability results from a complex interplay between molecular properties, biology, and physiology, and because training data are scarce. In this work we improve the hybrid model developed earlier [34]. We reduce the median fold change error for the total oral exposure from 2.85 to 2.35 and for intravenous administration from 1.95 to 1.62. This is achieved by training on a larger data set and by improving both the neural network architecture and the parametrization of the mechanistic model. Further, we extend our approach to predict additional endpoints and to handle different covariates, such as sex and dosage form. In contrast to a pure machine learning model, our model is able to predict new endpoints on which it has not been trained. We demonstrate this feature by predicting the exposure over the first 24 h, although the model has only been trained on the total exposure.
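The abstract reports accuracy as a median fold change error. As a minimal sketch of how such a metric is commonly computed, the Python snippet below assumes the usual definition, in which the fold change error of a single prediction is the larger of predicted/observed and observed/predicted, so that a value of 2 means the prediction is off by a factor of two in either direction. This section does not state the exact definition used, so this definition, the function names, and the example values are illustrative assumptions, not results from the paper.

```python
from statistics import median


def fold_change_error(predicted, observed):
    """Fold change error of one prediction (assumed definition).

    Returns max(predicted/observed, observed/predicted), which is always >= 1.
    """
    if predicted <= 0 or observed <= 0:
        raise ValueError("exposures must be strictly positive")
    ratio = predicted / observed
    return max(ratio, 1.0 / ratio)


def median_fold_change_error(predicted, observed):
    """Median of the per-compound fold change errors."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    return median(fold_change_error(p, o) for p, o in zip(predicted, observed))


if __name__ == "__main__":
    # Made-up exposure values in arbitrary units, for illustration only.
    predicted_exposure = [2.0, 1.0, 3.0]
    observed_exposure = [1.0, 1.0, 1.0]
    print(median_fold_change_error(predicted_exposure, observed_exposure))
```

Under this definition, over- and under-prediction by the same factor are penalized equally, and taking the median makes the summary less sensitive to a few badly predicted compounds than a mean would be.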