Code comments are essential for clarifying code functionality, improving readability, and facilitating collaboration among developers. Despite their importance, comments often become outdated, leading to inconsistencies with the corresponding code. Such inconsistencies can mislead developers and potentially introduce bugs. Our research investigates the impact of code-comment inconsistency on bug introduction using large language models, specifically GPT-3.5. We first compare the performance of GPT-3.5 with other state-of-the-art methods in detecting these inconsistencies, demonstrating its superiority in this domain. We then analyze the temporal evolution of code-comment inconsistencies and their effect on bug proneness over various timeframes using GPT-3.5 and odds-ratio analysis. Our findings reveal that inconsistent changes are around 1.5 times more likely than consistent changes to lead to a bug-introducing commit, highlighting the necessity of maintaining consistent and up-to-date comments in software development. This study provides new insights into the relationship between code-comment inconsistency and software quality, offering a comprehensive analysis of its impact over time: the effect on bug introduction is highest immediately after the inconsistency is introduced and diminishes over time.
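The "1.5 times more likely" figure is an odds ratio over a 2x2 contingency table of change type (inconsistent vs. consistent) against outcome (bug-introducing vs. clean commit). A minimal sketch of that computation follows; the counts below are illustrative placeholders, not the study's actual data.

```python
# Odds-ratio sketch over a hypothetical 2x2 contingency table.
# Counts are invented for illustration; the paper's real data differs.

def odds_ratio(a, b, c, d):
    """OR = (a/b) / (c/d) = (a*d) / (b*c).

    a, b: bug-introducing / clean commits among inconsistent changes
    c, d: bug-introducing / clean commits among consistent changes
    """
    return (a * d) / (b * c)

# Hypothetical counts: 90 of 1000 inconsistent changes introduced a bug,
# versus 60 of 1000 consistent changes.
inconsistent_buggy, inconsistent_clean = 90, 910
consistent_buggy, consistent_clean = 60, 940

or_value = odds_ratio(inconsistent_buggy, inconsistent_clean,
                      consistent_buggy, consistent_clean)
print(f"Odds ratio: {or_value:.2f}")  # roughly 1.5 with these counts
```

An odds ratio above 1 indicates the outcome (a bug-introducing commit) is more likely in the exposed group (inconsistent changes); a value of about 1.5 corresponds to the headline finding.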