Recent advances in cryo-EM and protein structure prediction algorithms have made large-scale protein structures accessible, paving the way for machine learning-based functional annotation. The field of geometric deep learning focuses on creating methods that operate on geometric data. An essential aspect of learning from protein structures is representing these structures as geometric objects (be it a grid, graph, or surface) and applying a learning method tailored to this representation. The performance of a given approach then depends on both the representation and its corresponding learning method. In this paper, we investigate representing proteins as $\textit{3D mesh surfaces}$ and incorporate them into an established representation benchmark. Our first finding is that, despite promising preliminary results, the surface representation alone does not seem competitive with 3D grids. Building on this, we introduce a synergistic approach that combines surface representations with graph-based methods, resulting in a general framework that incorporates both representations in learning. We show that this combination obtains state-of-the-art results across $\textit{all tested tasks}$. Our code and data can be found online: https://github.com/Vincentx15/atom2D .