Large AI models trained on audio data could rapidly classify patients, enhancing medical decision-making and potentially improving outcomes through early detection. Existing technologies depend on limited datasets collected with expensive recording equipment in high-income, English-speaking countries. This hinders deployment in resource-constrained, high-volume settings, where audio data may have a profound impact. This report introduces a novel data type and a corresponding collection system that captures health data through guided questions using only a mobile/web application. The application produces an audio electronic health record (voice EHR), which may contain complex biomarkers of health from conventional voice/respiratory features, speech patterns, and language with semantic meaning, compensating for the typical limitations of unimodal clinical datasets. This report also introduces a consortium of partners for global work, presents the application used for data collection, and showcases the potential of informative voice EHR to advance the scalability and diversity of audio AI.