Speech is promising as an objective, convenient tool for monitoring health remotely over time using mobile devices. Numerous paralinguistic features have been shown to carry salient information about an individual's health. However, mobile device specifications and acoustic environments vary widely, which puts the reliability of the extracted features at risk. As an initial step towards quantifying these effects, we report the variability of 13 exemplar paralinguistic features commonly reported in the speech-health literature. The features were extracted from the speech of 42 healthy volunteers, recorded consecutively in rooms with low and high reverberation using one budget smartphone, two higher-end smartphones, and a condenser microphone. Our results show that reverberation has a clear effect on several features, in particular voice quality markers. These findings point to new research directions investigating how best to record and process in-the-wild speech for reliable longitudinal health state assessment.