Performance is arguably the most crucial attribute that reflects the quality of a configurable software system. However, given the increasing scale and complexity of modern software, modeling and predicting how various configurations affect performance has become one of the major challenges in software maintenance. As such, performance is often modeled without thorough knowledge of the software system, relying mainly on data instead, which fits precisely with the purpose of deep learning. In this paper, we conduct a comprehensive review exclusively on the topic of deep learning for performance learning of configurable software, covering 1,206 searched papers spanning six indexing services, from which 99 primary papers were extracted and analyzed. Our results outline key statistics, taxonomy, strengths, weaknesses, and optimal usage scenarios for techniques related to the preparation of configuration data, the construction of deep learning performance models, the evaluation of these models, and their utilization in various software configuration-related tasks. We also identify good practices and potentially problematic phenomena from the studies surveyed, together with a comprehensive summary of actionable suggestions and insights into future opportunities within the field. To promote open science, all the raw results of this survey can be accessed at our repository: https://github.com/ideas-labo/DCPL-SLR.