Perceptual audio quality measurement systems algorithmically analyze the output of audio processing systems and use perceptual models of human audition to estimate possible degradation in perceived quality. In this manner, they save the time and resources associated with designing and running listening tests (LTs). Models of disturbance audibility that predict peripheral auditory masking have considerably improved the subjective quality prediction performance of quality measurement systems for signals processed by perceptual audio codecs. In addition, cognitive effects are known to regulate the perceived severity of distortions by influencing their salience. However, the performance gains from cognitive effect models in quality measurement systems have so far been inconsistent, particularly for music signals. First, this paper presents an improved model of informational masking (IM), an important cognitive effect in quality perception, that considers disturbance information complexity around the masking threshold. Second, we incorporate the proposed IM metric into a quality measurement system using a novel interaction analysis procedure, which uses LT data to establish interactions between cognitive effects and distortion metrics. The proposed IM metric is shown to outperform previously proposed IM metrics in a validation task against subjective quality scores from large and diverse LT databases. In particular, the proposed system showed improved quality prediction for music signals coded with bandwidth extension techniques, where other models frequently fail.
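To make the idea of an interaction between a cognitive-effect metric and a distortion metric concrete, the following is a minimal sketch on synthetic data. It is not the interaction analysis procedure proposed in the paper: it substitutes an ordinary least-squares regression with a product term as a stand-in, and the variable names (`distortion`, `im_metric`, `scores`), the coefficients, and the data are all hypothetical. It only illustrates how the agreement between predicted and subjective scores might be compared with and without an interaction term.

```python
import numpy as np


def fit_and_correlate(features, scores):
    """Fit least squares with an intercept; return Pearson r of fit vs. scores."""
    design = np.column_stack([np.ones(len(scores)), features])
    coef, *_ = np.linalg.lstsq(design, scores, rcond=None)
    predicted = design @ coef
    return np.corrcoef(predicted, scores)[0, 1]


rng = np.random.default_rng(0)
n_items = 200

# Hypothetical per-item metrics, both scaled to [0, 1] (assumption).
distortion = rng.uniform(0.0, 1.0, n_items)
im_metric = rng.uniform(0.0, 1.0, n_items)

# Synthetic stand-in for LT scores: distortion lowers quality, but less so
# when the IM metric is high. Coefficients are arbitrary illustration values.
scores = (100.0 - 60.0 * distortion
          + 40.0 * distortion * im_metric
          + rng.normal(0.0, 3.0, n_items))

# Model A: main effects only. Model B: main effects plus the product term.
r_main = fit_and_correlate(np.column_stack([distortion, im_metric]), scores)
r_inter = fit_and_correlate(
    np.column_stack([distortion, im_metric, distortion * im_metric]), scores
)

print(f"Pearson r, main effects only: {r_main:.3f}")
print(f"Pearson r, with interaction:  {r_inter:.3f}")
```

Because the second model nests the first, its in-sample correlation cannot be lower. A meaningful comparison would therefore use held-out LT data, which this sketch omits for brevity.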