Software malleability allows applications to be easily changed, configured, and adapted even after deployment. While prior work has explored configurable systems, adaptive recommender systems, and malleable GUIs, these approaches are often tailored to specific software and lack generalizability. In this work, we envision per-user malleable mobile applications, where end-users specify requirements that are automatically implemented via LLM-based code generation. Realizing this vision, however, requires overcoming a key challenge: designing automated test generation that can reliably verify both the presence and the correctness of user-specified functionalities. We propose \tool, a user-requirement-driven GUI test generation framework that incrementally navigates the UI, triggers the desired functionalities, and constructs LLM-guided oracles to validate correctness. We build a benchmark spanning six popular mobile applications with both correct and faulty user-requested functionalities, and show that \tool effectively validates per-user features and is practical for real-world deployment. Our work highlights the feasibility of shifting mobile app development from a product-manager-driven to an end-user-driven paradigm.