Failure data collected from the field (e.g., failure traces, bug reports, and memory dumps) are an invaluable source of information for developers who need to reproduce and analyze failures. Unfortunately, field data may include sensitive information and thus cannot be collected indiscriminately. Privacy-preserving techniques can address this problem by anonymizing data and reducing the risk of disclosing personal information. However, collecting anonymized information may harm reproducibility; that is, the anonymized data may not allow the reproduction of a failure observed in the field. In this paper, we present an empirical investigation of the impact of privacy-preserving techniques on the reproducibility of failures. In particular, we study how five privacy-preserving techniques may impact reproducibility for 19 bugs in 17 Android applications. The results provide insights into how to select and configure privacy-preserving techniques.