Soundscape studies typically attempt to capture the perception and understanding of sonic environments by surveying users. However, long-term monitoring and the assessment of interventions require approaches based on the sound signal itself. To this end, most previous research has focused on psychoacoustic quantities or automatic sound recognition; few attempts have been made to include appraisal (e.g., in circumplex frameworks). This paper proposes an artificial intelligence (AI)-based dual-branch convolutional neural network with cross-attention-based fusion (DCNN-CaF) for automatic soundscape characterization, covering both sound recognition and appraisal. Using the DeLTA dataset, which contains human-annotated sound source labels and perceived annoyance, the DCNN-CaF is designed to perform sound source classification (SSC) and human-perceived annoyance rating prediction (ARP). Experimental findings indicate that (1) the DCNN-CaF using both loudness and Mel features outperforms the DCNN-CaF using only one of them; (2) the DCNN-CaF with cross-attention fusion outperforms other typical AI-based models and soundscape-related traditional machine learning methods on the SSC and ARP tasks; (3) correlation analysis reveals that the relationship between sound sources and annoyance is similar for humans and the proposed DCNN-CaF model; and (4) generalization tests show that the model's ARP in the presence of sound sources unknown to the model is consistent with expert expectations and can explain previous findings from the literature on soundscape augmentation.
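To make the idea of cross-attention-based fusion between the two branches more concrete, the following is a minimal sketch, not the authors' implementation. It assumes each branch (Mel and loudness) has already produced a sequence of frame-level embeddings of equal length and dimension, and it omits the learned query/key/value projections and convolutional layers that a trained network would include. The function names `cross_attention` and `fuse` are hypothetical and chosen only for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention where each query frame attends over all key frames.

    queries: T_q x d, keys: T_k x d, values: T_k x d_v (lists of lists of floats).
    Returns the attended output (T_q x d_v) and the attention weights (T_q x T_k).
    """
    d = len(queries[0])
    weights = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax(scores))
    d_v = len(values[0])
    output = [
        [sum(w * v[j] for w, v in zip(w_row, values)) for j in range(d_v)]
        for w_row in weights
    ]
    return output, weights


def fuse(mel_frames, loudness_frames):
    """Hypothetical fusion: attend in both directions, then concatenate per frame.

    Assumption: both inputs have the same number of frames and the same dimension.
    """
    mel_ctx, _ = cross_attention(mel_frames, loudness_frames, loudness_frames)
    loud_ctx, _ = cross_attention(loudness_frames, mel_frames, mel_frames)
    return [m + l for m, l in zip(mel_ctx, loud_ctx)]


# Toy embeddings: 3 frames, 2 dimensions per branch (made-up values).
mel = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
loudness = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
fused = fuse(mel, loudness)
```

In this sketch each fused frame has twice the branch dimension, because the Mel-queried and loudness-queried contexts are concatenated. A shared representation of this kind could then feed separate output heads for SSC and ARP, though how the paper's model connects its fusion output to the two tasks is not specified in the abstract.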