Software systems log massive amounts of data, recording important runtime information. Such logs are used, for example, for log-based anomaly detection, which aims to automatically detect abnormal behaviors of the system under analysis by processing the information recorded in its logs. Many log-based anomaly detection techniques based on deep-learning models include a pre-processing step called log parsing. However, the impact of log parsing on the accuracy of anomaly detection techniques has received surprisingly little attention so far. It is therefore warranted to investigate which key properties log parsing techniques should ideally have to help anomaly detection. In this paper, we report on a comprehensive empirical study of the impact of log parsing on anomaly detection accuracy, using 13 log parsing techniques and five deep-learning-based anomaly detection techniques on two publicly available log datasets. Our empirical results show that, contrary to what is widely assumed, there is no strong correlation between log parsing accuracy and anomaly detection accuracy, regardless of the metric used for measuring log parsing accuracy. Moreover, we experimentally confirm existing theoretical results showing that a property we refer to as distinguishability in log parsing results, rather than their accuracy, plays an essential role in achieving accurate anomaly detection.