Mobile applications have become a ubiquitous part of daily life, providing users with access to various services and utilities. Text input, as an important interaction channel between users and applications, plays a key role in core functionality such as search queries, authentication, and messaging. However, certain special text (e.g., -18 for Font Size) can cause an app to crash, so there is strong demand for generating diverse unusual inputs to test apps thoroughly. This is challenging because of combinatorial explosion, high context sensitivity, and complex constraint relations. This paper proposes InputBlaster, which leverages an LLM to automatically generate unusual text inputs for mobile app crash detection. It formulates the unusual input generation problem as a task of producing a set of test generators, each of which can yield a batch of unusual text inputs under the same mutation rule. In detail, InputBlaster leverages the LLM to produce the test generators together with the mutation rules, which serve as the reasoning chain, and uses in-context learning to provide the LLM with examples that boost performance. InputBlaster is evaluated on 36 text input widgets with crash bugs from 31 popular Android apps, and the results show that it achieves a 78% bug detection rate, 136% higher than the best baseline. In addition, we integrate it with an automated GUI testing tool and detect 37 previously unseen crashes in real-world apps from Google Play.
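The idea of a test generator, a mutation rule paired with code that yields a batch of unusual inputs obeying that rule, can be illustrated with a minimal Python sketch. Everything named here (`TestGenerator`, the three rules, `unusual_inputs`) is hypothetical and hand-written for a numeric field such as Font Size; in InputBlaster the generators and rules are produced by the LLM, and this sketch does not reproduce that step.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TestGenerator:
    """Hypothetical container: a mutation rule plus a function that yields inputs obeying it."""
    mutation_rule: str
    generate: Callable[[random.Random, int], List[str]]


def negative_numbers(rng: random.Random, n: int) -> List[str]:
    # Rule: the value is a negative integer (e.g. "-18" for a font size).
    return [str(-rng.randint(1, 10_000)) for _ in range(n)]


def oversized_numbers(rng: random.Random, n: int) -> List[str]:
    # Rule: the value is far larger than any plausible setting.
    return [str(10 ** rng.randint(10, 40)) for _ in range(n)]


def non_numeric(rng: random.Random, n: int) -> List[str]:
    # Rule: the value contains no digits at all.
    alphabet = "abc!@# "
    return [
        "".join(rng.choice(alphabet) for _ in range(rng.randint(1, 8)))
        for _ in range(n)
    ]


# Assumed, hand-written stand-ins for what an LLM would produce.
GENERATORS = [
    TestGenerator("negative number", negative_numbers),
    TestGenerator("oversized number", oversized_numbers),
    TestGenerator("non-numeric text", non_numeric),
]


def unusual_inputs(
    generators: List[TestGenerator], batch_size: int = 3, seed: int = 0
) -> Dict[str, List[str]]:
    """Run every generator once and return one batch of inputs per mutation rule."""
    rng = random.Random(seed)
    return {g.mutation_rule: g.generate(rng, batch_size) for g in generators}


if __name__ == "__main__":
    for rule, batch in unusual_inputs(GENERATORS).items():
        print(rule, batch)
```

Grouping inputs by rule, rather than emitting them one at a time, means a single generator covers a whole class of unusual values, which is one plausible way to cope with the combinatorial explosion mentioned in the abstract.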