Molecular dynamics simulations have become a fundamental tool for studying biomolecules. It is often desirable to simulate a collection of particles under varying conditions in which the molecules can fluctuate. In this paper, we explore and adapt soft prompt-based learning to molecular dynamics tasks. Our model generalizes to unseen and out-of-distribution scenarios with limited training data. Although our work uses temperature as a test case, the approach is versatile and allows efficient simulation under any continuous dynamic condition, such as pressure or volume. Our framework has two stages: 1) pre-training with a data mixing technique, which augments the molecular structure data and temperature prompts and then applies curriculum learning by smoothly increasing their ratio; and 2) meta-learning-based fine-tuning, which improves the sample efficiency of the fine-tuning process and gives soft prompt tuning better initialization points. Comprehensive experiments show that our framework achieves high accuracy on in-domain data and generalizes strongly to unseen and out-of-distribution samples.
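As a rough illustration of the curriculum in the first stage, the sketch below ramps a mixing ratio smoothly over training and uses it to compose each batch. It is not the authors' implementation. It assumes that the "ratio" refers to the share of augmented, temperature-prompted samples in a batch, and that a cosine ramp is an acceptable stand-in for "smoothly". The names `mixing_ratio` and `sample_batch` and the two sample pools are hypothetical.

```python
import math
import random


def mixing_ratio(step, total_steps, start=0.0, end=1.0):
    """Ramp the share of prompted samples from `start` to `end`.

    Assumption: a cosine ramp; the abstract only says the ratio grows smoothly.
    """
    progress = min(max(step / total_steps, 0.0), 1.0)
    smooth = 0.5 * (1.0 - math.cos(math.pi * progress))
    return start + (end - start) * smooth


def sample_batch(plain_pool, prompted_pool, batch_size, ratio, rng):
    """Draw a batch in which about `ratio` of the items come from `prompted_pool`."""
    n_prompted = round(batch_size * ratio)
    batch = [rng.choice(prompted_pool) for _ in range(n_prompted)]
    batch += [rng.choice(plain_pool) for _ in range(batch_size - n_prompted)]
    rng.shuffle(batch)
    return batch


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical placeholders for structure-only and temperature-prompted samples.
    plain = [("structure", i) for i in range(10)]
    prompted = [("structure+temperature", i) for i in range(10)]
    for step in (0, 250, 500, 750, 1000):
        ratio = mixing_ratio(step, total_steps=1000)
        batch = sample_batch(plain, prompted, batch_size=8, ratio=ratio, rng=rng)
        n = sum(1 for kind, _ in batch if kind == "structure+temperature")
        print(f"step={step:4d} ratio={ratio:.2f} prompted_in_batch={n}")
```

A linear or sigmoid schedule would fit the same interface. The only property taken from the abstract is that the ratio increases gradually rather than in a single jump.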