Artificial intelligence (AI) models trained on audio data could rapidly perform clinical tasks, enhancing medical decision-making and potentially improving outcomes through early detection. Existing technologies depend on limited datasets collected with expensive recording equipment in high-income countries, which hinders deployment in resource-constrained, high-volume settings where audio data could have a profound impact on health equity. This report introduces a novel data type and a corresponding collection system that captures health data through guided questions using only a mobile/web application. The app facilitates the collection of an audio electronic health record (voice EHR), which may contain complex biomarkers of health from conventional voice/respiratory features, speech patterns, and spoken language with semantic meaning and longitudinal context, potentially compensating for the typical limitations of unimodal clinical datasets. This report presents the application used for data collection, initial experiments on data quality, and case studies that demonstrate the potential of voice EHR to advance the scalability and diversity of audio AI.