Performance is arguably the most crucial attribute that reflects the behavior of a configurable software system. However, given the increasing scale and complexity of modern software, modeling and predicting how various configurations impact performance has become one of the major challenges in software maintenance. As such, performance is often modeled without thorough knowledge of the software system, relying mainly on data instead, which fits precisely with the purpose of deep learning. In this paper, we conduct a comprehensive review exclusively on the topic of deep learning for performance learning of configurable software, covering 948 searched papers spanning six indexing services, from which 85 primary papers were extracted and analyzed. Our results summarize the key topics and statistics on how the configuration data is prepared; how the deep configuration performance learning model is built; how the model is evaluated; and how the model is exploited in different tasks related to software configuration. We also identify good practices and potentially problematic phenomena in the studies surveyed, together with insights on future opportunities for the field. To promote open science, all the raw results of this survey can be accessed at our repository: https://github.com/ideas-labo/DCPL-SLR.