Many processes in biology and drug discovery involve 3D interactions between different molecules, such as protein-protein and protein-small molecule interactions. Designing a generalist model that learns universal molecular interactions is valuable yet challenging, because different molecules are usually represented at different levels of granularity. In this paper, we first propose to universally represent a 3D molecule as a geometric graph of sets, in contrast to conventional single-level representations. Building on this unified representation, we then propose a Generalist Equivariant Transformer (GET) to effectively capture both sparse block-level and dense atom-level interactions. Specifically, GET consists of a bilevel attention module, a feed-forward module, and a layer normalization module, each of which is E(3)-equivariant to respect the symmetry of the 3D world. Extensive experiments on protein-protein affinity, ligand binding affinity, and ligand efficacy prediction verify the effectiveness of our proposed method compared with existing methods, and reveal its potential to learn transferable knowledge across different domains and tasks.
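To make the "geometric graph of sets" idea concrete, the following is a minimal sketch under stated assumptions, not the GET implementation. It treats a molecule as a list of blocks (for example residues or fragments), each holding a variable-size set of atoms with 3D coordinates. The helper names (`make_block`, `block_edges`, `block_distance_matrix`, `transform`), the centroid-cutoff rule for block-level edges, and the toy coordinates are all hypothetical choices made for illustration. The sketch then checks numerically that the atom-level distances and the block-level edges are unchanged by a random orthogonal transformation plus translation, which is the kind of E(3) symmetry the abstract refers to.

```python
import numpy as np

# Hypothetical container: one block = a set of atoms with types and 3D coordinates.
def make_block(block_type, atom_types, coords):
    coords = np.asarray(coords, dtype=float)
    assert coords.shape == (len(atom_types), 3)
    return {"type": block_type, "atoms": list(atom_types), "X": coords}

def block_distance_matrix(block_a, block_b):
    # Dense atom-level geometry between two blocks, shape (n_a, n_b).
    diff = block_a["X"][:, None, :] - block_b["X"][None, :, :]
    return np.linalg.norm(diff, axis=-1)

def block_edges(blocks, cutoff):
    # Sparse block-level graph (assumed rule): connect blocks whose
    # centroids lie within `cutoff` of each other.
    centers = np.stack([b["X"].mean(axis=0) for b in blocks])
    d = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    n = len(blocks)
    return [(i, j) for i in range(n) for j in range(n) if i != j and d[i, j] < cutoff]

def transform(blocks, Q, t):
    # Apply x -> Q x + t to every atom of every block.
    return [make_block(b["type"], b["atoms"], b["X"] @ Q.T + t) for b in blocks]

# Toy example: a 4-atom block and a 2-atom block (coordinates are made up).
blocks = [
    make_block("GLY", ["N", "C", "C", "O"],
               [[0.0, 0.0, 0.0], [1.5, 0.0, 0.0], [2.0, 1.4, 0.0], [1.2, 2.3, 0.0]]),
    make_block("FRAG", ["C", "O"],
               [[4.0, 0.0, 1.0], [5.0, 0.5, 1.0]]),
]

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))  # random orthogonal matrix
t = rng.normal(size=3)                        # random translation
moved = transform(blocks, Q, t)

D = block_distance_matrix(blocks[0], blocks[1])
D_moved = block_distance_matrix(moved[0], moved[1])
edges = block_edges(blocks, cutoff=10.0)
edges_moved = block_edges(moved, cutoff=10.0)

print(D.shape, np.allclose(D, D_moved), edges == edges_moved)
```

Because the blocks have different sizes, the atom-level distance matrix between two blocks is rectangular rather than a single scalar. Handling such variable-size sets within each block-level edge is what distinguishes this representation from a single-level atom graph or residue graph.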