State-of-the-art anomalous sound detection (ASD) systems for domain-shifted conditions project audio signals into an embedding space and compute anomaly scores with distance-based outlier detection. A major difficulty is the so-called domain mismatch between the anomaly score distributions of a source domain and a target domain, which differ acoustically and in the amount of training data provided. A decision threshold that is optimal for one domain may be highly sub-optimal for the other, and vice versa. This significantly degrades performance when only a single decision threshold is used, as is required when the same trained ASD system must generalize to multiple data domains, some of which may be unseen during training. To reduce this mismatch between the domains, we propose a simple local-density-based anomaly score normalization scheme. In experiments conducted on several ASD datasets, we show that the proposed normalization scheme consistently improves performance for various types of embedding-based ASD systems and yields better results than existing anomaly score normalization approaches.
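To make the threshold mismatch concrete, the following is a minimal sketch of one plausible local-density normalization. The abstract does not state the exact formula, so the form used here is an assumption: the raw nearest-neighbor distance of a test embedding is divided by the mean distance from its nearest reference embedding to that reference's own k nearest neighbors. The function names (`pairwise_dist`, `local_density`, `anomaly_scores`), the Euclidean metric, and the toy 2-D embeddings are all hypothetical and chosen only for illustration.

```python
import numpy as np


def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def local_density(ref, k):
    """Mean distance from each reference embedding to its k nearest other references."""
    d = pairwise_dist(ref, ref)
    np.fill_diagonal(d, np.inf)  # exclude each point's distance to itself
    return np.sort(d, axis=1)[:, :k].mean(axis=1)


def anomaly_scores(test, ref, k=1, normalize=True, eps=1e-12):
    """Nearest-neighbor distance, optionally divided by the local density
    around the nearest reference sample (assumed normalization form)."""
    d = pairwise_dist(test, ref)
    nn = d.argmin(axis=1)                    # index of nearest reference
    raw = d[np.arange(len(test)), nn]        # raw distance-based score
    if not normalize:
        return raw
    return raw / (local_density(ref, k)[nn] + eps)


# Toy reference set: a widely spaced "source" cluster and a tight "target" cluster.
ref = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0],
                [10.0, 0.0], [10.1, 0.0], [10.2, 0.0]])

# Test embeddings: normal source, normal target, anomalous target.
test = np.array([[1.5, 0.0], [10.15, 0.0], [10.1, 0.3]])

raw = anomaly_scores(test, ref, normalize=False)
norm = anomaly_scores(test, ref, k=1, normalize=True)
print("raw scores:       ", raw)
print("normalized scores:", norm)
```

In this toy setup the normal source sample has a larger raw score than the anomalous target sample, so no single threshold on the raw scores separates normal from anomalous across both domains. After dividing by the local density, the anomalous target sample scores highest and one threshold suffices.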