Unit testing is a crucial, yet often tedious and time-consuming task. To relieve developers of this burden, automated unit test generation techniques have been developed. Existing program-analysis-based tools, such as EvoSuite and Randoop, lack program comprehension, so the unit tests they produce have poor readability and limited assertions. Language-model-based tools, such as AthenaTest and A3Test, are limited in their ability to generate correct unit tests. In this paper, we introduce ChatUniTest, a ChatGPT-based automated unit test generation tool developed under the Generation-Validation-Repair framework. ChatUniTest generates tests by parsing the project, extracting essential information, and creating an adaptive focal context that includes the focal method and its dependencies while staying within a pre-defined maximum prompt token limit. The context is incorporated into a prompt and submitted to ChatGPT. Once ChatGPT's response is received, ChatUniTest extracts the raw test from the response and validates it. It first applies rule-based repair to fix syntactic and simple compile errors, and then applies ChatGPT-based repair to address the more challenging errors. Our evaluation shows that ChatUniTest outperforms EvoSuite in branch and line coverage, surpasses AthenaTest and A3Test in focal method coverage, and effectively generates assertions while using mock objects and reflection to achieve test objectives.
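The Generation-Validation-Repair loop described in the abstract can be illustrated with a minimal Python sketch. This is not ChatUniTest's implementation: every function name here (`build_context`, `extract_test`, `rule_based_repair`, `generate_test`) is hypothetical, token counting is approximated by whitespace splitting, the single repair rule (closing unbalanced braces) is a toy stand-in for the tool's rule set, and the model and the validator are passed in as plain callables (`ask_llm`, `validate`) rather than real ChatGPT or compiler calls.

```python
import re
from typing import Callable, List, Optional

FENCE = "`" * 3  # built at runtime so this listing contains no literal fence


def count_tokens(text: str) -> int:
    """Crude stand-in for a real tokenizer: whitespace-separated words."""
    return len(text.split())


def build_context(focal_method: str, dependencies: List[str], max_tokens: int) -> str:
    """Always keep the focal method; add each dependency that still fits the budget."""
    parts = [focal_method]
    used = count_tokens(focal_method)
    for dep in dependencies:
        cost = count_tokens(dep)
        if used + cost > max_tokens:
            continue  # skip a dependency that would exceed the limit
        parts.append(dep)
        used += cost
    return "\n".join(parts)


def extract_test(response: str) -> Optional[str]:
    """Return the first fenced code block, tolerating a missing closing fence."""
    pattern = FENCE + r"(?:\w+)?[ \t]*\n(.*?)(?:" + FENCE + r"|\Z)"
    match = re.search(pattern, response, re.DOTALL)
    return match.group(1).strip() if match else None


def rule_based_repair(test: str) -> str:
    """Toy rule (assumption): close braces left open by a truncated response."""
    missing = test.count("{") - test.count("}")
    return test + "\n}" * max(missing, 0)


def generate_test(
    focal_method: str,
    dependencies: List[str],
    ask_llm: Callable[[str], str],
    validate: Callable[[str], Optional[str]],  # returns an error message or None
    max_tokens: int = 2048,
    max_rounds: int = 3,
) -> Optional[str]:
    """Generate, validate, then repair: rules first, the model as a fallback."""
    context = build_context(focal_method, dependencies, max_tokens)
    prompt = "Write a JUnit test for the following method.\n" + context
    for _ in range(max_rounds):
        test = extract_test(ask_llm(prompt))
        if test is None:
            continue  # no code in the response; ask again
        error = validate(test)
        if error is not None:
            test = rule_based_repair(test)  # cheap fix before another model call
            error = validate(test)
        if error is None:
            return test
        # Rules were not enough: hand the error back to the model.
        prompt = "Fix this test.\nError: " + error + "\n" + test
    return None
```

In this sketch, `validate` would wrap whatever compiles and runs the candidate test, and `ask_llm` would wrap the chat API call. Trying the rule-based repair before the model-based one keeps cheap, deterministic fixes from costing an extra model request.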