Own voice pickup technology for hearable devices facilitates communication in noisy environments. Own voice reconstruction (OVR) systems enhance the quality and intelligibility of the recorded noisy own voice signals. Since the disturbances affecting the recorded own voice signals depend on individual factors, personalized OVR systems have the potential to outperform generic OVR systems. In this paper, we propose personalizing OVR systems through data augmentation and fine-tuning and compare the personalized systems to their generic counterparts. We investigate the influence of personalization on speech quality as assessed by objective metrics and conduct a subjective listening test to evaluate quality under various conditions. In addition, we assess the prediction accuracy of the objective metrics by comparing predicted quality with subjectively measured quality. Our findings suggest that personalized OVR provides benefits over generic OVR only for some talkers. Our results also indicate that objective metrics do not always accurately predict performance differences between systems. In particular, for certain disturbances the objective metrics consistently overestimate quality relative to the subjective ratings.