Commit messages explain the code changes in a commit and facilitate collaboration among developers. Several commit message generation approaches have been proposed; however, they have had limited success in capturing the context of code changes. We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures the context of code changes using a graph-based representation and leverages a transformer-based model to generate high-quality commit messages. Our method uses a delta graph, which we developed to represent code differences effectively. We also introduce a customizable quality assurance module that identifies optimal messages, mitigating the subjectivity of commit messages. Experiments show that Comet outperforms state-of-the-art techniques on the BLEU-norm and METEOR metrics while being comparable on ROUGE-L. Additionally, we compare the proposed approach with the popular gpt-3.5-turbo model and with gpt-4-turbo, the most capable GPT model, in zero-shot, one-shot, and multi-shot settings. Comet outperforms the two GPT models on five and four metrics, respectively, and provides competitive results on the other two metrics. The study has implications for researchers, tool developers, and software developers. Software developers may use Comet to generate context-aware commit messages. Researchers and tool developers can apply the proposed delta graph technique in similar contexts, such as code review summarization.
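For readers unfamiliar with ROUGE-L, one of the metrics named above, the following is a minimal sketch of the standard sentence-level definition: an F-measure over the longest common subsequence (LCS) of candidate and reference tokens. It is not the evaluation script used for Comet. The lowercasing, whitespace tokenization, the `beta` default, and the function names `lcs_length` and `rouge_l` are all assumptions made for illustration, and the example commit messages are invented.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """Sentence-level ROUGE-L F-measure.

    Assumption: tokens are lowercased, whitespace-separated words.
    Published evaluation scripts may tokenize and weight differently.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Hypothetical generated message vs. a human-written reference.
score = rouge_l("fix null pointer in parser",
                "fix null pointer exception in parser")
print(f"ROUGE-L: {score:.3f}")
```

Because the LCS only requires tokens to appear in the same order, not contiguously, a generated message that omits a word from the reference is penalized through recall but still receives credit for the rest of the sequence.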