Energy-Based Flow Matching for Generating 3D Molecular Structure

Molecular structure generation is a fundamental problem that involves determining the 3D positions of a molecule's constituents. It has crucial biological applications, such as molecular docking, protein folding, and molecular design. Recent generative modeling approaches, such as diffusion models and flow matching, have made great progress on these tasks by modeling molecular conformations as a distribution. In this work, we focus on flow matching and adopt an energy-based perspective to improve the training and inference of structure generation models. Our view results in a mapping function, represented by a deep network, that is directly learned to iteratively map random configurations, i.e. samples from the source distribution, to target structures, i.e. points on the data manifold. This yields a conceptually simple and empirically effective flow matching setup that is theoretically justified and has interesting connections to fundamental properties such as idempotency and stability, as well as to empirically useful techniques such as structure refinement in AlphaFold. Experiments on protein docking and protein backbone generation consistently demonstrate the method's effectiveness: it outperforms recent task-specific flow matching and diffusion baselines under a similar computational budget.
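The idea of iteratively mapping a random configuration to a target structure can be illustrated with a toy sketch. This is not the paper's method: the learned deep network is replaced here by a hand-written map that takes one gradient step on an assumed quadratic energy, and the names `energy`, `refine`, `generate`, and `TARGET` are hypothetical, introduced only for illustration.

```python
import random

def energy(x, target):
    """Assumed toy energy: half the squared distance to the target structure."""
    return 0.5 * sum((a - b) ** 2 for p, q in zip(x, target) for a, b in zip(p, q))

def refine(x, target, step=0.5):
    """One application of the toy mapping: a gradient step on the toy energy.

    In the paper this role is played by a learned network; here the target
    is known in closed form, which is purely an illustrative assumption.
    """
    return [tuple(a + step * (b - a) for a, b in zip(p, q)) for p, q in zip(x, target)]

def generate(target, n_iters=30, seed=0):
    """Draw a random configuration from a Gaussian source and refine it repeatedly."""
    rng = random.Random(seed)
    x = [tuple(rng.gauss(0.0, 1.0) for _ in range(3)) for _ in target]
    energies = [energy(x, target)]
    for _ in range(n_iters):
        x = refine(x, target)
        energies.append(energy(x, target))
    return x, energies

# Hypothetical three-atom "structure" used only as a toy data point.
TARGET = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
final, energies = generate(TARGET)
```

In this toy setting the energy shrinks with every application of the map, and applying `refine` to the target itself returns the target unchanged, which loosely mirrors the fixed-point (idempotency) behaviour the abstract mentions.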

