Software misconfiguration has consistently been a major cause of software failures. Over the past two decades, much work has gone into detecting and diagnosing software misconfigurations. However, a gap remains between real-world misconfigurations and the literature. It is therefore worth investigating whether existing taxonomies and tools apply to real-world misconfigurations in modern software. In this paper, we conduct an empirical study of 772 real-world misconfiguration issues, based on which we propose a novel classification of the root causes of software misconfigurations into four categories: constraint violation, resource unavailability, component integration error, and configuration semantic misinterpretation. We then systematically review the literature on misconfiguration troubleshooting to study research trends and the practicality of the tools and datasets in this field. We find that research targets have shifted from system and infrastructure software to advanced applications (e.g., cloud services). Meanwhile, research on non-crash misconfigurations has grown significantly. Despite this progress, a majority of studies lack reproducibility because their tools and evaluation datasets are unavailable. In total, only eleven tools and four datasets are publicly available. We analyze the trends in the existing literature on misconfiguration troubleshooting, summarize the challenges that users face, and highlight suggestions for mitigating and diagnosing software misconfigurations. We release our real-world dataset of misconfiguration issues for follow-up research.