When applying the Transformer architecture to source code, designing a good self-attention mechanism is critical, as it determines how node relationships are extracted from the Abstract Syntax Trees (ASTs) of the source code. We present Code Structure Aware Transformer (CSA-Trans), which uses a Code Structure Embedder (CSE) to generate a dedicated Positional Encoding (PE) for each node in the AST via disentangled attention. To further extend the self-attention capability, we adopt Stochastic Block Model (SBM) attention. Our evaluation shows that our PE captures the relationships between AST nodes better than other graph-related PE techniques. We also show, through quantitative and qualitative analysis, that SBM attention generates more node-specific attention coefficients. We demonstrate that CSA-Trans outperforms 14 baselines on code summarization tasks for both Python and Java, while being 41.92% faster and 25.31% more memory efficient on the Java dataset than AST-Trans and SG-Trans, respectively.
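To make the idea behind SBM attention more concrete, the following is a minimal PyTorch sketch, not the authors' implementation: each AST node is softly assigned to a small number of latent blocks, and a learnable block-affinity matrix modulates the usual scaled-dot-product attention so that coefficients become node-specific. All names here (`SBMAttention`, `num_blocks`, `block_affinity`) are hypothetical, and the actual method may instead sample sparse attention masks from the block probabilities.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SBMAttention(nn.Module):
    """Illustrative sketch of Stochastic Block Model (SBM) style self-attention."""

    def __init__(self, dim, num_blocks=8):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.cluster_proj = nn.Linear(dim, num_blocks)              # soft block membership per node
        self.block_affinity = nn.Parameter(torch.eye(num_blocks))   # learnable inter-block affinity B
        self.scale = dim ** -0.5

    def forward(self, x):
        # x: (batch, num_nodes, dim), e.g. AST node embeddings with PE added
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)

        # Soft block memberships: (batch, num_nodes, num_blocks)
        membership = F.softmax(self.cluster_proj(x), dim=-1)

        # Pairwise affinity induced by the SBM: m_i^T B m_j
        pair_affinity = membership @ self.block_affinity @ membership.transpose(-2, -1)

        # Standard attention weights, rescaled by block affinities and renormalized
        logits = (q @ k.transpose(-2, -1)) * self.scale
        attn = F.softmax(logits, dim=-1) * pair_affinity
        attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        return attn @ v
```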