The acoustic transformations that VoIP (Voice over Internet Protocol) platforms apply to speech merit rigorous analysis. This study examines the effects of proprietary sender-side denoising on platforms such as Google Meet and Zoom. It draws on the Deep Noise Suppression (DNS) 2020 dataset to evaluate speech systematically across denoising settings and receiver interfaces. As a methodological novelty, the Oaxaca decomposition, traditionally an econometric tool, is repurposed to analyze acoustic-phonetic perturbations in VoIP systems. To ground the implications of these transformations, the psychoacoustic metrics PESQ and STOI are used to characterize how the speech is altered. Taken together, the results indicate that VoIP processing affects speech acoustics in complex ways. Beyond the primary findings, a range of additional metrics is reported, along with out-of-domain benchmarks for both time-domain and time-frequency-domain speech enhancement models.
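The abstract names the Oaxaca decomposition without describing its setup, so the following is only a minimal sketch of a generic twofold Blinder-Oaxaca decomposition, not the study's actual procedure. It assumes two groups of recordings (say, two denoising settings), a single covariate, and a single acoustic outcome. All names (`fit_line`, `oaxaca_twofold`, the group labels) and all numbers are hypothetical, and the data are synthetic. The gap in mean outcome between the groups is split into a part explained by the difference in covariate means and a part attributable to differing regression coefficients.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def oaxaca_twofold(x_a, y_a, x_b, y_b):
    """Twofold decomposition of mean(y_a) - mean(y_b), with group B as reference."""
    int_a, slope_a = fit_line(x_a, y_a)
    int_b, slope_b = fit_line(x_b, y_b)
    mx_a, mx_b = fmean(x_a), fmean(x_b)
    # Part of the gap due to the groups having different covariate means.
    explained = (mx_a - mx_b) * slope_b
    # Part of the gap due to the groups having different coefficients.
    unexplained = (int_a - int_b) + mx_a * (slope_a - slope_b)
    return {
        "gap": fmean(y_a) - fmean(y_b),
        "explained": explained,
        "unexplained": unexplained,
    }


# Synthetic illustration only: x is a made-up covariate (e.g. an input SNR in dB)
# and y is a made-up acoustic outcome; none of these values come from the study.
rng = random.Random(0)
x_a = [rng.gauss(20.0, 5.0) for _ in range(200)]
y_a = [1.0 + 0.10 * x + rng.gauss(0.0, 0.2) for x in x_a]
x_b = [rng.gauss(15.0, 5.0) for _ in range(200)]
y_b = [0.5 + 0.08 * x + rng.gauss(0.0, 0.2) for x in x_b]

result = oaxaca_twofold(x_a, y_a, x_b, y_b)
print(result)
```

Because each fitted line passes through its group's means, the explained and unexplained parts sum to the raw gap up to floating-point error. Choosing group A as the reference instead would give a different, equally valid split.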