This article introduces idMotif, a visual analytics framework that helps domain experts identify motifs in protein sequences. Motifs, short sequences of amino acids, are critical for understanding the distinct functions of proteins, and identifying them is pivotal for predicting diseases or infections. idMotif uses a deep learning-based method to categorize protein sequences and derives potential motif candidates within protein groups from local explanations of the model's decisions. It offers multiple interactive views for analyzing protein clusters or groups and their sequences. A case study, complemented by expert feedback, illustrates how idMotif supports the analysis of protein sequences and the identification of motifs.