Sperm whales (Physeter macrocephalus) navigate underwater using a series of impulsive, click-like sounds known as echolocation clicks. These clicks have a characteristic multipulse structure (MPS) that serves as a distinctive signature. In this work, we use the stability of the MPS as a metric for detecting and classifying clicks in noisy environments. To distinguish clicks from noise transients and to handle simultaneous emissions from multiple sperm whales, our approach clusters a time series of MPS measures while removing potential clicks that do not satisfy limits on inter-click interval, duration, and spectrum. As a result, our approach can handle strong noise transients and low signal-to-noise ratios. We evaluate the detector on three datasets: seven months of recordings from the Mediterranean Sea containing manually verified ambient noise; several days of manually labelled data collected from the island of Dominica containing approximately 40,000 clicks from multiple sperm whales; and a dataset from the Bahamas containing 1,203 labelled clicks from a single sperm whale. Compared with two benchmark detectors, our approach achieves a better trade-off between precision and recall and a significant reduction in the false detection rate, especially in noisy environments. To ensure reproducibility, we provide our database of labelled clicks along with our implementation code.
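The constraint step mentioned in the abstract (removing potential clicks that violate limits on inter-click interval, duration, and spectrum) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` type, the `filter_candidates` function, and all numeric limits are hypothetical placeholders chosen for illustration, and the sketch omits the MPS-based clustering that the approach uses to separate simultaneously clicking whales.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A potential click (hypothetical representation, not from the paper)."""
    time_s: float         # arrival time in seconds
    duration_ms: float    # estimated duration in milliseconds
    peak_freq_khz: float  # frequency of the spectral peak in kHz


# Placeholder limits, assumed for illustration only; the paper's values may differ.
ICI_RANGE_S = (0.3, 2.0)
DURATION_RANGE_MS = (0.1, 30.0)
FREQ_RANGE_KHZ = (2.0, 24.0)


def within(value, bounds):
    """Return True if value lies inside the closed interval bounds."""
    low, high = bounds
    return low <= value <= high


def filter_candidates(candidates):
    """Keep candidates that pass the duration, spectrum, and ICI checks."""
    # Step 1: per-candidate checks on duration and spectral peak.
    plausible = sorted(
        (c for c in candidates
         if within(c.duration_ms, DURATION_RANGE_MS)
         and within(c.peak_freq_khz, FREQ_RANGE_KHZ)),
        key=lambda c: c.time_s,
    )
    # Step 2: keep a candidate only if the gap to its previous or next
    # plausible neighbour falls inside the allowed inter-click interval.
    kept = []
    for i, c in enumerate(plausible):
        gaps = []
        if i > 0:
            gaps.append(c.time_s - plausible[i - 1].time_s)
        if i < len(plausible) - 1:
            gaps.append(plausible[i + 1].time_s - c.time_s)
        if any(within(g, ICI_RANGE_S) for g in gaps):
            kept.append(c)
    return kept


candidates = [
    Candidate(0.0, 5.0, 10.0),
    Candidate(0.5, 5.0, 10.0),
    Candidate(1.0, 5.0, 10.0),
    Candidate(1.05, 100.0, 10.0),  # too long: fails the duration check
    Candidate(10.0, 5.0, 10.0),    # isolated: fails the ICI check
]
kept_times = [c.time_s for c in filter_candidates(candidates)]
print(kept_times)
```

Because the interval check here looks only at adjacent candidates, it would misjudge interleaved click trains from several whales; that is the case the abstract addresses by clustering the MPS measures.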