This paper describes Meta's TestGen-LLM tool, which uses LLMs to automatically improve existing human-written tests. TestGen-LLM verifies that its generated test classes clear a set of filters that assure measurable improvement over the original test suite, thereby eliminating problems due to LLM hallucination. We describe the deployment of TestGen-LLM at Meta test-a-thons for the Instagram and Facebook platforms. In an evaluation on the Reels and Stories products for Instagram, 75% of TestGen-LLM's test cases built correctly, 57% passed reliably, and 25% increased coverage. During Meta's Instagram and Facebook test-a-thons, it improved 11.5% of all classes to which it was applied, with 73% of its recommendations being accepted for production deployment by Meta software engineers. We believe this is the first report on industrial-scale deployment of LLM-generated code backed by such assurances of code improvement.
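The filtering idea can be illustrated with a minimal sketch. The abstract names three outcomes (a test builds, passes reliably, and increases coverage) but does not specify how the filters are implemented, so everything below is an assumption made for illustration: the `Candidate` record, its field names, the reading of "passes reliably" as passing every repeated run, and the `surviving` helper are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Candidate:
    """Hypothetical record of one generated test class and its measured outcomes."""
    name: str
    builds: bool            # did the generated test class compile?
    pass_runs: int          # repeated executions that passed
    total_runs: int         # repeated executions attempted
    new_lines_covered: int  # lines covered beyond the original suite


# Each filter returns True when the candidate may move on to the next stage.
# The order mirrors the abstract: builds, passes reliably, increases coverage.
FILTERS: List[Callable[[Candidate], bool]] = [
    lambda c: c.builds,
    # Assumption: "reliably" means every one of the repeated runs passed.
    lambda c: c.total_runs > 0 and c.pass_runs == c.total_runs,
    lambda c: c.new_lines_covered > 0,
]


def surviving(candidates: Iterable[Candidate]) -> List[Candidate]:
    """Keep only candidates that clear every filter, in order."""
    return [c for c in candidates if all(f(c) for f in FILTERS)]
```

Under these assumptions, a candidate that fails to build, passes only intermittently, or adds no coverage is discarded before a human reviewer ever sees it, which is the sense in which filtering guards against hallucinated tests.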