Context: Crowdsourced testing has gained popularity in software testing, especially mobile app testing, because it brings diversity and helps tackle fragmentation issues. However, its openness also presents challenges: manually reviewing the resulting large volume of test reports is time-consuming and labor-intensive.

Objective: The primary goal of this research is to improve the efficiency of the review process in crowdsourced testing. Traditional test report prioritization approaches lack a deep understanding of the semantic information in the reports' textual descriptions. This paper introduces LLMPrior, a novel approach for prioritizing crowdsourced test reports using large language models (LLMs).

Method: LLMPrior uses LLMs to analyze and cluster crowdsourced test reports according to the types of bugs revealed in their textual descriptions, applying prompt engineering techniques to improve LLM performance. After clustering, a recurrent selection algorithm prioritizes the reports.

Results: Empirical experiments are conducted to evaluate the effectiveness of LLMPrior. The findings indicate that LLMPrior not only outperforms current state-of-the-art approaches but is also more feasible, efficient, and reliable, a result attributed to the prompt engineering techniques and the cluster-based prioritization strategy.

Conclusion: LLMPrior represents a significant advancement in crowdsourced test report prioritization. By effectively combining large language models with a cluster-based strategy, it addresses the challenges of traditional prioritization approaches and offers app developers a more efficient and reliable way to handle crowdsourced test reports.
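The abstract does not detail the recurrent selection step, but a common reading of cluster-based prioritization is round-robin selection across bug-type clusters, so that reports revealing distinct bug types surface early in the review queue. The sketch below illustrates that idea under this assumption; the function name `prioritize` and the input format (a precomputed report-to-cluster mapping, standing in for the LLM's output) are hypothetical, not taken from the paper.

```python
from collections import defaultdict

def prioritize(reports, cluster_of):
    """Hypothetical sketch of cluster-based recurrent selection.

    `reports` is a list of report identifiers; `cluster_of` maps each
    report to its (LLM-assigned) bug-type cluster label. Reports are
    grouped by cluster, then one report is drawn from each cluster in
    round-robin fashion until all clusters are exhausted.
    """
    # Group reports by cluster, preserving the original order within each.
    clusters = defaultdict(list)
    for r in reports:
        clusters[cluster_of[r]].append(r)

    # Repeatedly take the next report from each non-empty cluster.
    queues = list(clusters.values())
    ordered = []
    while any(queues):
        for q in queues:
            if q:
                ordered.append(q.pop(0))
    return ordered
```

With two "crash" reports and two "ui" reports, for example, this ordering interleaves the clusters, so a reviewer sees one report of each bug type before seeing a second report of either.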