Harnessing Artificial Intelligence (AI) and m-health to detect new biomarkers indicative of the onset and progression of respiratory abnormalities and conditions has attracted considerable scientific and research interest, especially during the COVID-19 pandemic. The smarty4covid dataset contains audio signals of cough (4,676), regular breathing (4,665), deep breathing (4,695), and voice (4,291), recorded with mobile devices through a crowd-sourcing approach. Other self-reported information is also included (e.g., COVID-19 virus tests), making it a comprehensive dataset for developing COVID-19 risk detection models. The smarty4covid dataset is released as a Web Ontology Language (OWL) knowledge base, which enables data consolidation from other relevant datasets, complex queries, and reasoning. It has been used to develop models able to: (i) extract clinically informative respiratory indicators from regular breathing records, and (ii) identify cough, breath, and voice segments in crowd-sourced audio recordings. A new framework that uses the smarty4covid OWL knowledge base to generate counterfactual explanations for opaque AI-based COVID-19 risk detection models is proposed and validated.