Large AI models trained on audio data have the potential to rapidly classify patients, enhancing medical decision-making and potentially improving outcomes through early detection. Existing technologies depend on limited datasets collected with expensive recording equipment in high-income, English-speaking countries. This hinders deployment in resource-constrained, high-volume settings, where audio data may have a profound impact. This report introduces a novel data type and a corresponding collection system that captures health data through guided questions using only a mobile/web application. The application produces an audio electronic health record (voice EHR), which may contain complex biomarkers of health drawn from conventional voice/respiratory features, speech patterns, and language with semantic meaning, compensating for the typical limitations of unimodal clinical datasets. This report introduces a consortium of partners for global work, presents the application used for data collection, and showcases the potential of informative voice EHR to advance the scalability and diversity of audio AI.