Sound event detection (SED) is an active area of audio research that aims to detect when sounds occur in time. In this paper, we apply SED to engine fault detection by introducing a multimodal SED framework that detects fine-grained faults in automobile engines using audio and accelerometer-recorded vibration. We first introduce the problem of engine fault SED on a dataset collected from a large variety of vehicles with expert-labeled engine fault sound events. Next, we propose an SED model that temporally detects ten fine-grained engine faults and further explore a pretraining strategy using a large-scale, weakly labeled engine fault dataset. Through multiple evaluations, we show that the proposed framework effectively detects engine fault sound events. Finally, we investigate the interaction and characteristics of each modality and show that fusing audio and vibration features improves overall engine fault SED performance.