Automated test generation is essential for software quality assurance, with coverage serving as a key metric of testing thoroughness. Recent advances in Large Language Models (LLMs) have shown promise in improving test generation, particularly in achieving higher coverage. However, while existing LLM-based test generation solutions perform well on small, isolated code snippets, they struggle when applied to complex methods under test. To address this limitation, we propose a scalable LLM-based unit test generation method. Our approach consists of two key steps. The first step is context information retrieval, which uses both LLMs and static analysis to gather relevant contextual information associated with the complex methods under test. The second step, iterative test generation with code elimination, repeatedly generates unit tests for the code slice, tracks the achieved coverage, and selectively removes code segments that have already been covered. This process simplifies the testing task and mitigates issues arising from token limits or reduced reasoning effectiveness associated with excessively long contexts. Through comprehensive evaluations on open-source projects, our approach outperforms state-of-the-art LLM-based and search-based methods, demonstrating its effectiveness in achieving high coverage on complex methods.
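The iterative generation-with-elimination loop described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: `generate_tests` and `measure_coverage` are hypothetical placeholders standing in for the LLM-backed generator and the coverage tracker, and the "code slice" is modeled as a simple list of statements.

```python
def generate_tests(code_slice):
    """Hypothetical stand-in for the LLM-backed test generator."""
    # Toy behavior: pretend one test is produced per remaining statement.
    return [f"test_for_{stmt}" for stmt in code_slice]

def measure_coverage(tests, code_slice):
    """Hypothetical stand-in for the coverage tracker.

    Returns the set of statements the given tests exercise.
    """
    # Toy assumption: each generated test covers the statement it targets.
    return {stmt for stmt in code_slice if f"test_for_{stmt}" in tests}

def iterative_generation(code_slice, max_rounds=5):
    """Repeatedly generate tests, then eliminate already-covered statements
    so the remaining prompt stays short (the 'code elimination' step)."""
    all_tests, remaining = [], list(code_slice)
    for _ in range(max_rounds):
        if not remaining:
            break  # the whole slice is covered
        tests = generate_tests(remaining)
        covered = measure_coverage(tests, remaining)
        all_tests.extend(tests)
        # Keep only the statements not yet covered for the next round.
        remaining = [stmt for stmt in remaining if stmt not in covered]
    return all_tests, remaining

tests, uncovered = iterative_generation(["stmt1", "stmt2", "stmt3"])
```

The key design point the sketch captures is that each round's prompt shrinks to the still-uncovered portion of the method, which is how the approach sidesteps token limits on long contexts.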