Digital sensing faces challenges in developing sustainable methods to extend the applicability of customized e-noses to the complex body odor volatilome (BOV). To address this challenge, we developed MORE-ML, a computational framework that integrates quantum-mechanical (QM) property data of e-nose molecular building blocks with machine learning (ML) methods to predict sensing-relevant properties. Within this framework, we expanded our previous dataset, MORE-Q, to MORE-QX by sampling a larger conformational space of interactions between BOV molecules and mucin-derived receptors. This dataset provides extensive electronic binding features (BFs) computed upon BOV adsorption. Analysis of the MORE-QX property space revealed weak correlations between the QM properties of the building blocks and the resulting BFs. Leveraging this observation, we defined electronic descriptors of the building blocks as inputs for tree-based ML models to predict BFs. Benchmarking showed that CatBoost models outperformed the alternatives, especially in transferability to unseen compounds. Explainable AI methods further highlighted which QM properties most influence BF predictions. Collectively, MORE-ML combines QM insights with ML to provide mechanistic understanding and rational design principles for molecular receptors in BOV sensing. This approach establishes a foundation for advancing artificial sensing materials capable of analyzing complex odor mixtures, bridging the gap between molecular-level computations and practical e-nose applications.
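The abstract does not say how transferability to unseen compounds was evaluated. One common way to test it, sketched here as an assumption and not as the authors' protocol, is to hold out whole compounds so that no conformer of a test molecule appears in the training set. The function name `leave_compounds_out_split` and the compound labels are hypothetical, and fitting a model such as CatBoost on the resulting indices is left out.

```python
import random
from collections import defaultdict


def leave_compounds_out_split(compound_ids, test_fraction=0.2, seed=0):
    """Split sample indices so that no compound is shared by train and test.

    compound_ids: one label per sample (e.g. one per conformer); samples that
    belong to the same molecule carry the same label.
    """
    # Group sample indices by the compound they belong to.
    groups = defaultdict(list)
    for idx, cid in enumerate(compound_ids):
        groups[cid].append(idx)

    # Shuffle the compounds (not the samples) reproducibly.
    unique = sorted(groups)
    random.Random(seed).shuffle(unique)

    # Hold out at least one whole compound.
    n_test = max(1, round(test_fraction * len(unique)))
    held_out = set(unique[:n_test])

    train_idx = sorted(i for cid in unique[n_test:] for i in groups[cid])
    test_idx = sorted(i for cid in unique[:n_test] for i in groups[cid])
    return train_idx, test_idx, held_out


# Hypothetical toy data: several conformers per compound.
compound_ids = ["bov_a", "bov_a", "bov_b", "bov_b", "bov_b",
                "bov_c", "bov_d", "bov_e", "bov_e"]
train_idx, test_idx, held_out = leave_compounds_out_split(
    compound_ids, test_fraction=0.4
)
train_compounds = {compound_ids[i] for i in train_idx}
print("held-out compounds:", sorted(held_out))
print("train/test sizes:", len(train_idx), len(test_idx))
```

Splitting by compound, not by individual sample, keeps conformers of the same molecule from leaking across the split, so the test error reflects performance on molecules the model has not seen.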