Nearly a decade of research in software engineering has focused on automating mobile app testing to help engineers overcome the unique challenges of the platform. Much of this work has come in the form of Automated Input Generation (AIG) tools that dynamically explore app screens. However, such tools have repeatedly been shown to achieve lower-than-expected code coverage, particularly on sophisticated proprietary apps. Prior work has illustrated that a primary cause of these coverage deficiencies is so-called tarpits: complex screens that are difficult to navigate. In this paper, we take a critical step toward enabling AIG tools to effectively navigate tarpits during app exploration through a new form of automated semantic screen understanding. We introduce AURORA, a technique that learns from the visual and textual patterns in mobile app UIs to automatically detect common screen designs and navigate them accordingly. The key insight behind AURORA is that mobile app screen designs fall into a finite number of categories, albeit with subtle variations, such that the general patterns of each category of UI design can be learned. Accordingly, AURORA employs a multi-modal neural screen classifier that recognizes the most common types of UI screen designs. After classifying a given screen, it applies a set of flexible, generalizable heuristics to navigate the screen properly. We evaluated AURORA both on a set of 12 apps with known tarpits from prior work and on a new set of five of the most popular apps from the Google Play store. Our results indicate that AURORA effectively navigates tarpit screens, outperforming prior approaches that avoid tarpits by 19.6% in terms of method coverage. These improvements can be attributed to AURORA's UI design classification and heuristic navigation techniques.