Consistency and reliability are crucial for conducting AI research. Many well-known research fields, such as object detection, have been compared and validated with solid benchmark frameworks. Since AlphaFold2, the protein folding task has entered a new phase, and many methods have been proposed that build on components of AlphaFold2. A unified research framework for protein folding, containing implementations and benchmarks, is therefore important for comparing various approaches consistently and fairly. To achieve this, we present Solvent, a protein folding framework that supports the significant components of state-of-the-art models through an off-the-shelf interface. Solvent contains different models implemented in a unified codebase and supports training and evaluation of the defined models on the same dataset. We benchmark well-known algorithms and their components and provide experiments that give helpful insights into the protein structure modeling field. We hope that Solvent will increase the reliability and consistency of proposed models and improve efficiency in both speed and cost, thereby accelerating research on protein folding modeling. The code is available at https://github.com/kakaobrain/solvent, and the project will continue to be developed.