Ongoing advances in force field and computer hardware development enable the use of molecular dynamics (MD) to simulate increasingly complex systems, with the ultimate goal of reaching cellular complexity. At the same time, rational design by high-throughput (HT) simulations is another forefront of MD. In both areas, the Martini coarse-grained (CG) force field, especially its latest version (v3), is being actively explored because it offers enhanced spatial-temporal resolution. However, the automation tools for preparing Martini simulations that accompanied the previous version of the force field were not designed for HT simulations or for studies of complex cellular systems, and they have therefore become a major limiting factor. To address these shortcomings, we present the open-source vermouth Python library. Vermouth is designed to become the unified framework for developing programs that prepare, run, and analyze Martini simulations of complex systems. To demonstrate the power of the vermouth library, we showcase the martinize2 program, a generalization of the martinize script, which was originally aimed at setting up simulations of proteins. In contrast to its predecessor, martinize2 automatically handles protonation states in proteins and post-translational modifications, offers more options to fine-tune structural biases such as the elastic network, and can convert non-protein molecules such as ligands. Finally, martinize2 is used in two high-complexity benchmarks: the entire I-TASSER protein template database and a subset of 200,000 structures from the AlphaFold Protein Structure Database are converted to CG resolution, and we illustrate how checks on input structure quality can safeguard HT applications.