Anomalous sound detection (ASD) benchmarks typically assume that the identity of the monitored machine is known at test time and that recordings are evaluated machine by machine. In realistic monitoring scenarios, however, several known machines operate concurrently, test recordings cannot always be attributed reliably to a specific machine, and requiring machine identity imposes deployment constraints such as a dedicated sensor per machine. We therefore consider a minimal modification of the ASD evaluation protocol in which test recordings from multiple machines are merged and evaluated jointly, without access to machine identity at inference time. Training data and evaluation metrics remain unchanged, and machine identity labels are used only for post hoc evaluation. Experiments with representative ASD methods show that relaxing this assumption reveals performance degradations and method-specific differences in robustness that are hidden under standard machine-wise evaluation, and that these degradations are strongly related to implicit machine identification accuracy.
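The contrast between the two protocols can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not specify: one anomaly scorer per known machine, a `score_matrix` whose entry `[i, m]` is the score of clip `i` under the model of machine `m`, and a minimum-over-machines fusion rule whose arg-min serves as the implicit machine guess. The function names, the fusion rule, and the toy numbers are hypothetical and are not taken from the evaluated methods.

```python
import numpy as np


def auc(scores, labels):
    """AUC as P(anomalous score > normal score); ties count as 0.5."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (len(pos) * len(neg)))


def evaluate(score_matrix, machine_ids, is_anomaly):
    """Compare machine-wise scoring with identity-free scoring.

    score_matrix[i, m]: anomaly score of clip i under the model of machine m.
    machine_ids[i]:     true machine of clip i (used only post hoc).
    is_anomaly[i]:      ground-truth anomaly label of clip i.
    """
    score_matrix = np.asarray(score_matrix, dtype=float)
    machine_ids = np.asarray(machine_ids)
    is_anomaly = np.asarray(is_anomaly, dtype=bool)

    # Standard protocol: each clip is scored by the model of its known machine.
    known = score_matrix[np.arange(len(machine_ids)), machine_ids]

    # Relaxed protocol (assumed fusion rule): a clip is only as anomalous as
    # its best-matching machine model says; the arg-min is the implicit guess.
    merged = score_matrix.min(axis=1)
    guessed = score_matrix.argmin(axis=1)

    result = {"machine_wise_auc": {}, "identity_free_auc": {}}
    for m in np.unique(machine_ids):
        mask = machine_ids == m  # identity labels enter only at this step
        result["machine_wise_auc"][int(m)] = auc(known[mask], is_anomaly[mask])
        result["identity_free_auc"][int(m)] = auc(merged[mask], is_anomaly[mask])
    result["implicit_id_accuracy"] = float(np.mean(guessed == machine_ids))
    return result


# Toy data: two machines, one normal and one anomalous clip each.
score_matrix = [
    [0.10, 0.90],  # machine 0, normal
    [0.80, 0.05],  # machine 0, anomalous, but resembles normal machine 1
    [0.70, 0.30],  # machine 1, normal
    [0.95, 0.90],  # machine 1, anomalous
]
machine_ids = [0, 0, 1, 1]
is_anomaly = [False, True, False, True]

result = evaluate(score_matrix, machine_ids, is_anomaly)
print(result)
```

In this toy case both machines reach an AUC of 1.0 when identity is known. Without identity, the anomalous clip of machine 0 is implicitly attributed to machine 1 and receives a low score, so the AUC for machine 0 drops to 0.0 while implicit identification accuracy is 0.75. This mirrors, in miniature, the link between degradation and implicit machine identification that the abstract reports.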