AtomSurf: Surface Representation for Learning on Protein Structures

While there has been significant progress in evaluating and comparing different representations for learning on protein data, the role of surface-based learning approaches remains poorly understood. In particular, there is a lack of direct and fair benchmark comparisons of the best available surface-based learning methods against alternative representations such as graphs. Moreover, the few existing surface-based approaches either use surface information in isolation or, at best, perform global pooling between surface and graph-based architectures. In this work, we fill this gap by first adapting a state-of-the-art surface encoder for protein learning tasks. We then perform a direct and fair comparison of the resulting method against alternative approaches within the Atom3D benchmark, highlighting the limitations of pure surface-based learning. Finally, we propose an integrated approach that allows learned features to be shared between graph and surface representations at the level of nodes and vertices $\textit{across all layers}$. We demonstrate that the resulting architecture achieves state-of-the-art results on all tasks in the Atom3D benchmark while adhering to the strict benchmark protocol, as well as more broadly on binding site identification and binding pocket classification. Furthermore, we use coarsened surfaces and optimize our approach for efficiency, making our tool competitive with existing techniques in training and inference time. Our code and data can be found online: $\texttt{github.com/Vincentx15/atomsurf}$
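To illustrate what sharing features between graph nodes and surface vertices across all layers could look like, here is a minimal NumPy sketch. It is not the authors' implementation: the function names (`knn_indices`, `share_features`, `toy_encoder`), the nearest-neighbor pairing rule, the mean aggregation, and the random per-point linear maps standing in for real graph and surface layers are all assumptions made for illustration.

```python
import numpy as np


def knn_indices(src_pos, dst_pos, k):
    """For each point in dst_pos, return indices of its k nearest points in src_pos."""
    # Pairwise squared distances, shape (n_dst, n_src).
    d2 = ((dst_pos[:, None, :] - src_pos[None, :, :]) ** 2).sum(axis=-1)
    return np.argsort(d2, axis=1)[:, :k]


def share_features(node_feat, vert_feat, node_to_vert, vert_to_node):
    """One bidirectional exchange (hypothetical rule): each side adds the mean
    feature of its spatial neighbors on the other side."""
    new_node = node_feat + vert_feat[node_to_vert].mean(axis=1)
    new_vert = vert_feat + node_feat[vert_to_node].mean(axis=1)
    return new_node, new_vert


def toy_encoder(node_pos, vert_pos, node_feat, vert_feat, n_layers=3, k=1, seed=0):
    """Alternate per-representation updates with node<->vertex sharing at every layer."""
    rng = np.random.default_rng(seed)
    dim = node_feat.shape[1]
    node_to_vert = knn_indices(vert_pos, node_pos, k)  # (n_nodes, k), indexes vertices
    vert_to_node = knn_indices(node_pos, vert_pos, k)  # (n_verts, k), indexes nodes
    for _ in range(n_layers):
        # Assumption: random linear maps + tanh stand in for a graph layer and a
        # surface layer; a real model would use learned, structure-aware operators.
        w_graph = rng.normal(scale=dim ** -0.5, size=(dim, dim))
        w_surf = rng.normal(scale=dim ** -0.5, size=(dim, dim))
        node_feat = np.tanh(node_feat @ w_graph)
        vert_feat = np.tanh(vert_feat @ w_surf)
        # Sharing happens inside the loop, i.e. at every layer, not once at the end.
        node_feat, vert_feat = share_features(
            node_feat, vert_feat, node_to_vert, vert_to_node
        )
    return node_feat, vert_feat
```

The point of the sketch is the placement of `share_features` inside the layer loop: a global-pooling design would instead run the two encoders independently and combine a single pooled vector per representation at the end.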

