Automated program comprehension underpins many software engineering tasks, from code summarisation to clone detection. Recent deep learning models achieve strong results but typically rely on source code alone, overlooking contextual information such as version history or structural relationships, which limits their ability to capture how code evolves and operates. We conduct an empirical study of how enriching code representations with such contextual signals affects neural model performance on key comprehension tasks. Two downstream tasks, code clone detection and code summarisation, are evaluated using SeSaMe (1,679 Java methods) and CodeSearchNet (63,259 methods). Five representative models (CodeBERT, GraphCodeBERT, CodeT5, PLBART, ASTNN) are fine-tuned under code-only and context-augmented settings. Results show that context generally improves performance: version history consistently boosts clone detection (e.g., CodeT5 +15.92% F1) and summarisation (e.g., GraphCodeBERT +5.56% METEOR), while call-graph effects vary by model and task. Combining multiple contexts yields further gains (up to +21.48% macro-F1). A human evaluation on 100 Java snippets confirms that context-augmented summaries are significantly preferred for Accuracy and Content Adequacy (p ≤ 0.026; |δ| up to 0.55). These findings highlight the potential of contextual signals to enhance code comprehension and open new directions for optimising contextual encoding in neural SE models.