We present SAnD (Simple Anomaly Detection), a novel, simple, and widely applicable semi-supervised procedure for anomaly detection in industrial and IoT environments. SAnD comprises five steps, each leveraging a well-known statistical tool: smoothing filters, variance inflation factors, the Mahalanobis distance, threshold selection algorithms, and feature importance techniques. To our knowledge, SAnD is the first procedure that integrates these tools to identify anomalies and help decipher their putative causes. We show how each step helps tackle the technical challenges that practitioners face when detecting anomalies in industrial contexts, where signals can be highly multicollinear, follow unknown distributions, and mix short-lived noise with longer-lived actual anomalies. The development of SAnD was motivated by a concrete case study from our industrial partner, which we use here to demonstrate its effectiveness. We also evaluate SAnD against a selection of semi-supervised methods on public datasets from the anomaly detection literature. We conclude that SAnD is effective, broadly applicable, and outperforms the compared approaches in both detection performance and runtime.
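To make the five ingredients concrete, the following is a minimal sketch of how they could be chained, in the order they are listed above. It is not the SAnD implementation: the abstract names the tools but not their configuration, so every specific here is an assumption chosen for illustration. These include the moving-average filter and its window, the VIF cutoff of 10, the empirical 0.99 quantile used as threshold, the per-feature decomposition of the squared Mahalanobis distance used as a stand-in for feature importance, and all function names and the synthetic data.

```python
import numpy as np


def moving_average(X, window=5):
    """Smooth each column with a moving average; returns len(X) - window + 1 rows."""
    kernel = np.ones(window) / window
    cols = [np.convolve(X[:, j], kernel, mode="valid") for j in range(X.shape[1])]
    return np.column_stack(cols)


def vif(X):
    """Variance inflation factors: the diagonal of the inverse correlation matrix.

    Assumes no column is an exact linear combination of the others.
    """
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def drop_collinear(X, max_vif=10.0):
    """Greedily drop the column with the largest VIF until all VIFs <= max_vif."""
    keep = list(range(X.shape[1]))
    while len(keep) > 1:
        v = vif(X[:, keep])
        worst = int(np.argmax(v))
        if v[worst] <= max_vif:
            break
        keep.pop(worst)
    return keep


def fit_reference(X):
    """Mean and inverse covariance estimated on anomaly-free data."""
    mu = X.mean(axis=0)
    prec = np.linalg.inv(np.atleast_2d(np.cov(X, rowvar=False)))
    return mu, prec


def mahalanobis_sq(X, mu, prec):
    """Squared Mahalanobis distance of every row of X to mu."""
    d = X - mu
    return np.einsum("ij,jk,ik->i", d, prec, d)


def contributions(X, mu, prec):
    """Per-feature terms whose row sums equal the squared Mahalanobis distance."""
    d = X - mu
    return d * (d @ prec)


# Synthetic data (hypothetical): column 2 is a near copy of column 0.
rng = np.random.default_rng(0)


def make_signals(n):
    z = rng.normal(size=(n, 3))
    near_copy = z[:, 0] + 0.01 * rng.normal(size=n)
    return np.column_stack([z[:, 0], z[:, 1], near_copy, z[:, 2]])


train = moving_average(make_signals(2000))   # assumed anomaly-free
raw_test = make_signals(500)
raw_test[200:300, 1] += 5.0                  # long-lived shift in column 1
test = moving_average(raw_test)

keep = drop_collinear(train)                 # indices of retained columns
mu, prec = fit_reference(train[:, keep])
threshold = np.quantile(mahalanobis_sq(train[:, keep], mu, prec), 0.99)

d2 = mahalanobis_sq(test[:, keep], mu, prec)
flags = d2 > threshold                       # True where a sample is flagged
contrib = contributions(test[:, keep], mu, prec)
top = np.array(keep)[np.argmax(contrib, axis=1)]  # original index of top feature

print("retained columns:", keep)
print("flagged samples:", int(flags.sum()))
```

In this sketch the reference statistics and the threshold are estimated only on data assumed to be anomaly-free, which is one common reading of "semi-supervised" in anomaly detection. Dropping one of the two near-identical columns before inverting the covariance matrix keeps that inversion well conditioned, which is the practical reason for screening multicollinearity ahead of the Mahalanobis step.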