Oxygen-containing organic compounds serve as an energy source for living organisms and contribute to protein formation, so understanding their role is crucial in biochemistry. This study addresses the challenge of understanding protein-protein interactions (PPI) and developing predictive models for proteins and organic compounds, with a specific focus on quantifying their binding affinity. Here, we introduce the active Bayesian Committee Machine (BCM) potential, designed specifically for oxygen-containing organic compounds within eight groups of CHO compounds. The BCM potential adopts a committee-based approach to address the scalability issues of kernel regressors, which become acute on large datasets. Its adaptable structure allows efficient and cost-effective expansion while maintaining both transferability and scalability. Through systematic benchmarking, we position the sparse BCM potential as a promising contender in the pursuit of a universal machine learning potential.
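To make the committee-based idea concrete, the following is a minimal sketch of the standard Bayesian Committee Machine combination rule at a single test point, as it is commonly stated in the literature. It is not taken from this work and may differ from the active and sparse variants described above. The function name `bcm_combine` and its parameters are hypothetical, and a zero-mean prior with variance `prior_variance` is assumed.

```python
from typing import Sequence, Tuple


def bcm_combine(
    means: Sequence[float],
    variances: Sequence[float],
    prior_variance: float,
) -> Tuple[float, float]:
    """Combine per-expert Gaussian predictions at one test point.

    Assumptions (illustrative, not from the source): each expert is a
    kernel regressor trained on its own data subset, the prior mean is
    zero, and `prior_variance` is the prior variance at the test point.
    """
    if not means or len(means) != len(variances):
        raise ValueError("means and variances must be non-empty and equal in length")
    if prior_variance <= 0 or any(v <= 0 for v in variances):
        raise ValueError("all variances must be positive")

    n_experts = len(means)
    # Sum the expert precisions, then remove the prior precision that was
    # counted once per expert beyond the first.
    precision = sum(1.0 / v for v in variances) - (n_experts - 1) / prior_variance
    if precision <= 0:
        raise ValueError("combined precision is not positive")

    variance = 1.0 / precision
    # Precision-weighted average of the expert means.
    mean = variance * sum(mu / v for mu, v in zip(means, variances))
    return mean, variance


if __name__ == "__main__":
    # Two hypothetical experts predicting the same quantity.
    print(bcm_combine([1.0, 3.0], [0.5, 0.5], prior_variance=1.0))
```

Under these assumptions, each expert only needs to be trained on its own subset, so the cubic training cost of an exact kernel regressor applies per subset rather than to the full dataset, and new experts can be added without retraining the existing ones.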