Spatial audio enhances immersion in applications such as virtual reality, augmented reality, gaming, and cinema by creating a three-dimensional auditory experience. Preserving the spatial fidelity of binaural audio is crucial because compression, encoding, and transmission can all alter localization cues. Subjective listening tests such as MUSHRA remain the gold standard for evaluating spatial localization quality, but they are costly and time-consuming. This paper introduces BINAQUAL, a full-reference objective metric designed to assess localization similarity in binaural audio recordings. BINAQUAL adapts the AMBIQUAL metric, originally developed for localization quality assessment in the ambisonics audio format, to the binaural domain. We evaluate BINAQUAL across five key research questions, examining its sensitivity to variations in sound source locations, angle interpolations, surround speaker layouts, audio degradations, and content diversity. Results demonstrate that BINAQUAL differentiates between subtle spatial variations and correlates strongly with subjective listening tests, making it a reliable metric for binaural localization quality assessment. The proposed metric provides a robust benchmark for ensuring spatial accuracy in binaural audio processing, paving the way for improved objective evaluation in immersive audio applications.