Speech is a promising objective and convenient tool for monitoring health remotely over time using mobile devices. Numerous paralinguistic features have been shown to carry salient information about an individual's health. However, mobile device specifications and acoustic environments vary widely, which threatens the reliability of the extracted features. As an initial step towards quantifying these effects, we report the variability of 13 exemplar paralinguistic features commonly reported in the speech-health literature. The features were extracted from the speech of 42 healthy volunteers, recorded consecutively in rooms with low and high reverberation using one budget smartphone, two higher-end smartphones, and a condenser microphone. Our results show that reverberation has a clear effect on several features, in particular voice quality markers. These findings point to new research directions investigating how best to record and process in-the-wild speech for reliable longitudinal health state assessment.