Anomalous sound detection (ASD) benchmarks typically assume that the identity of the monitored machine is known at test time and that recordings are evaluated machine-wise. However, in realistic monitoring scenarios where multiple known machines operate concurrently, test recordings may not be reliably attributable to a specific machine, and requiring machine identity imposes deployment constraints such as dedicated sensors per machine. We therefore consider a minimal modification of the ASD evaluation protocol in which test recordings from multiple machines are merged and evaluated jointly, without access to machine identity at inference time. Training data and evaluation metrics remain unchanged, and machine identity labels are used only for post hoc analysis. Experiments with representative ASD methods show that relaxing this assumption reveals performance degradations and method-specific differences in robustness that standard machine-wise evaluation hides, and that these degradations are strongly related to implicit machine identification accuracy.
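The contrast between the two protocols can be illustrated with a minimal sketch. The snippet below is not the paper's evaluation code; it uses synthetic, hypothetical anomaly scores for two machines whose detectors separate normal from anomalous clips well per machine, but on different score scales. Machine-wise AUC averages each machine's AUC separately, while the merged protocol pools all clips and discards machine identity, which can drop the AUC sharply.

```python
import random

def auc(normal_scores, anomaly_scores):
    """ROC AUC via the Mann-Whitney U statistic: the probability that a
    randomly chosen anomalous clip scores higher than a normal one."""
    wins = 0.0
    for a in anomaly_scores:
        for n in normal_scores:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(anomaly_scores) * len(normal_scores))

random.seed(0)
# Synthetic per-machine scores (illustrative only): each machine is
# well separated internally, but machine_B's scores sit on a higher scale.
machines = {
    "machine_A": {"normal": [random.gauss(0.0, 0.3) for _ in range(50)],
                  "anomaly": [random.gauss(1.0, 0.3) for _ in range(50)]},
    "machine_B": {"normal": [random.gauss(2.0, 0.3) for _ in range(50)],
                  "anomaly": [random.gauss(3.0, 0.3) for _ in range(50)]},
}

# Standard machine-wise evaluation: AUC per machine, then averaged.
machine_wise = [auc(m["normal"], m["anomaly"]) for m in machines.values()]
print("machine-wise mean AUC:", sum(machine_wise) / len(machine_wise))

# Merged evaluation: pool all clips and drop machine identity.
all_normal = [s for m in machines.values() for s in m["normal"]]
all_anomaly = [s for m in machines.values() for s in m["anomaly"]]
print("merged AUC:", auc(all_normal, all_anomaly))
```

Here machine_A's anomalies score below machine_B's normals, so the merged AUC falls well below the near-perfect machine-wise average, mimicking the degradation the modified protocol is designed to expose.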