We introduce a novel fully convolutional neural network (FCN) architecture for predicting the secondary structure of ribonucleic acid (RNA) molecules. Interpreting RNA structures as weighted graphs, we employ deep learning to estimate the probability of base pairing between nucleotide residues. Unique to our model are its massive 11-pixel kernels, which we argue provide a distinct advantage for FCNs in the specialized domain of RNA secondary structures. On a widely adopted, standardized test set comprising 1,305 molecules, the accuracy of our method exceeds that of current state-of-the-art (SOTA) secondary structure prediction software, achieving a Matthews correlation coefficient (MCC) 11-40% higher than that of other leading methods on overall structures and 58-400% higher on pseudoknots specifically.
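To make the evaluation metric concrete, the following is a minimal sketch of how an MCC score could be computed for a base-pairing prediction when a structure is represented as a symmetric matrix over nucleotide positions. It is an illustration under stated assumptions, not the evaluation code behind the reported results: the function names (`pairs_to_matrix`, `threshold`, `mcc`), the 0.5 cutoff, and the choice to score only the upper triangle of the matrix are all hypothetical choices made here.

```python
from math import sqrt


def pairs_to_matrix(n, pairs):
    """Build a symmetric 0/1 matrix for an RNA of length n.

    Entry [i][j] is 1 when positions i and j are base-paired (0-indexed).
    """
    matrix = [[0] * n for _ in range(n)]
    for i, j in pairs:
        matrix[i][j] = 1
        matrix[j][i] = 1
    return matrix


def threshold(prob, cutoff=0.5):
    """Turn a matrix of pairing probabilities into a 0/1 matrix.

    The cutoff of 0.5 is an assumption made for this sketch.
    """
    return [[1 if p >= cutoff else 0 for p in row] for row in prob]


def mcc(true, pred):
    """Matthews correlation coefficient over all position pairs i < j.

    Only the upper triangle is counted so that each candidate pair is
    scored once; this convention is an assumption of the sketch.
    """
    n = len(true)
    tp = fp = fn = tn = 0
    for i in range(n):
        for j in range(i + 1, n):
            t, p = true[i][j], pred[i][j]
            if t and p:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Return 0.0 when the coefficient is undefined (empty row or column).
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy example: an 8-nucleotide hairpin with one mispredicted pair.
true_matrix = pairs_to_matrix(8, [(0, 7), (1, 6), (2, 5)])
pred_matrix = pairs_to_matrix(8, [(0, 7), (1, 6), (2, 4)])
score = mcc(true_matrix, pred_matrix)
```

In this toy case two of the three true pairs are recovered, with one false positive and one false negative among the 28 candidate position pairs. A network that outputs pairing probabilities would first pass its output through something like `threshold` before scoring.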